Preface



Purpose and Emphasis. Mechanics not only is the oldest branch of physics but 
was and still is the basis for all of theoretical physics. Quantum mechanics can 
hardly be understood, perhaps cannot even be formulated, without a good knowl- 
edge of general mechanics. Field theories such as electrodynamics borrow their 
formal framework and many of their building principles from mechanics. In short, 
throughout the many modern developments of physics where one frequently turns 
back to the principles of classical mechanics its model character is felt. For this 
reason it is not surprising that the presentation of mechanics reflects to some ex- 
tent the development of modern physics and that today this classical branch of 
theoretical physics is taught rather differently than at the time of Arnold Som- 
merfeld, in the 1920s, or even in the 1950s, when more emphasis was put on the 
theory and the applications of partial-differential equations. Today, symmetries and 
invariance principles, the structure of the space-time continuum, and the geomet- 
rical structure of mechanics play an important role. The beginner should realize 
that mechanics is not primarily the art of describing block-and-tackles, collisions 
of billiard balls, constrained motions of the cylinder in a washing machine, or bi- 
cycle riding. However fascinating such systems may be, mechanics is primarily 
the field where one learns to develop general principles from which equations of 
motion may be derived, to understand the importance of symmetries for the dy- 
namics, and, last but not least, to get some practice in using theoretical tools and 
concepts that are essential for all branches of physics. 

Besides its role as a basis for much of theoretical physics and as a training 
ground for physical concepts, mechanics is a fascinating field in itself. It is not easy 
to master, for the beginner, because it has many different facets and its structure is 
less homogeneous than, say, that of electrodynamics. On a first assault one usually 
does not fully realize both its charm and its difficulty. Indeed, on returning to 
various aspects of mechanics, in the course of one’s studies, one will be surprised 
to discover again and again that it has new facets and new secrets. And finally, one 
should be aware of the fact that mechanics is not a closed subject, lost forever in 
the archives of the nineteenth century. As the reader will realize in Chap. 6, if he 
or she has not realized it already, mechanics is an exciting field of research with 
many important questions of qualitative dynamics remaining unanswered. 

Structure of the Book and a Reading Guide. Although many people prefer 
to skip prefaces, I suggest that the reader, if he or she is one of them, make an 
exception for once and read at least this section and the next. The short introductions
at the beginning of each chapter are also recommended because they give a
summary of the chapter’s content. 

Chapter 1 starts from Newton’s equations and develops the elementary dynam- 
ics of one-, two-, and many-body systems for unconstrained systems. This is the 
basic material that could be the subject of an introductory course on theoretical 
physics or could serve as a text for an integrated (experimental and theoretical) 
course. 

Chapter 2 is the “classical” part of general mechanics describing the principles 
of canonical mechanics following Euler, Lagrange, Hamilton, and Jacobi. Most of 
the material is a MUST. Nevertheless, the sections on the symplectic structure 
of mechanics (Sect. 2.28) and on perturbation theory (Sects. 2.38-2.40) may be 
skipped on a first reading. 

Chapter 3 describes a particularly beautiful application of classical mechanics: 
the theory of spinning tops. The rigid body provides an important and highly nontrivial
example of a motion manifold that is not a simple Euclidean space $\mathbb{R}^{2f}$,
where $f$ is the number of degrees of freedom. Its rotational part is the manifold of
SO(3), the rotation group in three real dimensions. Thus, the rigid body illustrates 
a Lie group of great importance in physics within a framework that is simple and 
transparent. 

Chapter 4 deals with relativistic kinematics and dynamics of pointlike objects 
and develops the elements of special relativity. This may be the most difficult part 
of the book, as far as the physics is concerned, and one may wish to return to it 
when studying electrodynamics. 

Chapter 5 is the most challenging in terms of the mathematics. It develops 
the basic tools of differential geometry that are needed to formulate mechanics in 
this setting. Mechanics is then described in geometrical terms and its underlying 
structure is worked out. This chapter is conceived such that it may help to bridge the 
gap between the more “physical” texts on mechanics and the modern mathematical 
literature on this subject. Although it may be skipped on a first reading, the tools 
and the language developed here are essential if one wishes to follow the modern 
literature on qualitative dynamics. 

Chapter 6 provides an introduction to one of the most fascinating recent de- 
velopments of classical dynamics: stability and deterministic chaos. It defines and 
illustrates all important concepts that are needed to understand the onset of chaotic 
motion and the quantitative analysis of unordered motions. It culminates in a few 
examples of chaotic motion in celestial mechanics. 

Chapter 7, finally, gives a short introduction to continuous systems, i.e. systems 
with an infinite number of degrees of freedom. 

Exercises and Practical Examples. In addition to the exercises that follow 
Chaps. 1-6, the book contains a number of practical examples in the form of exer- 
cises followed by complete solutions. Most of these are meant to be worked out on 
a personal computer, thereby widening the range of problems that can be solved 
with elementary means, beyond the analytically integrable ones. I have tried to 
choose examples simple enough that they can be made to work even on a programmable
pocket computer and in a spirit, I hope, that will keep the reader from
getting lost in the labyrinth of computational games.

Length of this Book. Clearly there is much more material here than can be 
covered in one semester. The book is designed for a two-semester course (i.e., typ- 
ically, an introductory course followed by a course on general mechanics). Even 
then, a certain choice of topics will have to be made. However, the text is suffi- 
ciently self-contained that it may be useful for complementary reading and indi- 
vidual study. 

Mathematical Prerequisites. A physicist must acquire a certain flexibility in 
the use of mathematics. On the one hand, it is impossible to carry out all steps in 
a deduction or a proof, since otherwise one will not get very far with the physics 
one wishes to study. On the other hand, it is indispensable to know analysis and 
linear algebra in some depth, so as to be able to fill in the missing links in a logical 
deduction. Like many other branches of physics, mechanics makes use of many 
and various disciplines of mathematics, and one cannot expect to have all the tools 
ready before beginning its study. In this book I adopt the following, somewhat gen- 
erous attitude towards mathematics. In many places, the details are worked out to a 
large extent; in others I refer to well-known material of linear algebra and analysis. 
In some cases the reader might have to return to a good text in mathematics or 
else, ideally, derive certain results for him- or herself. In this connection it might 
also be helpful to consult the appendix at the end of the book. 

General Comments and Acknowledgements. This fourth English edition follows
closely the seventh, enlarged, German edition. As compared to the third English
edition published in 1999, there are a number of revisions and additions. Some
of these are the following. In Chap. 1 more motivation for the introduction of phase 
space at this early stage is given. A paragraph on the notion of hodograph is added 
which emphasizes the special nature of Keplerian bound orbits. Chap. 2 is supple- 
mented by some extensions and further explanations, specifically in relation with 
Legendre transformation. Also, a new section on a generalized version of Noether’s 
theorem was added, together with some enlightening examples. In Chap. 3 more 
examples are given for inertia tensors and the use of Steiner’s theorem. Here and 
in Chap. 4 the symbolic “bra” and “ket” notation is introduced in characterizing 
vectors and their duals. A new feature is a name index which, in addition to the 
index, may be helpful in locating quickly specific items in mechanics. The book 
contains the solutions to all exercises, as well as some historical notes on scientists 
who made important contributions to mechanics and to the mathematics on which 
it rests. 

This book was inspired by a two-semester course on general mechanics that I 
have taught on and off over the last twenty years at the Johannes Gutenberg Uni- 
versity at Mainz and by seminars on geometrical aspects of mechanics. I thank my 
collaborators, colleagues, and students for stimulating questions, helpful remarks, 
and profitable discussions. I was happy to realize that the German original, since 
its first appearance in October 1988, has become a standard text at German-speaking
universities and I can only hope that it will continue to be equally successful
in its English version. I am grateful for the many encouraging reactions and suggestions
I have received over the years. Among those to whom I owe special gratitude
are P. Hagedorn, K. Hepp, D. Kastler, H. Leutwyler, L. Okun, N. Papadopoulos,
J.M. Richard, G. Schuster, J. Smith, M. Stingl, N. Straumann, W. Thirring, E. Vogt,
and V. Vento. Special thanks are due to my former student R. Schopf who collab- 
orated on the earlier version of the solutions to the exercises. I thank J. Wisdom 
for his kind permission to use four of his figures illustrating chaotic motions in 
the solar system, and P. Beckmann who provided the impressive illustrations for 
the logistic equation and who advised me on what to say about them. 

The excellent cooperation with the team of Springer-Verlag is gratefully acknowledged.
Last but not least, I owe special thanks to Dorte for her patience and
encouragement. 

As with the German edition, I dedicate this book to all those students who 
wish to study mechanics at some depth. If it helps to make them aware of the 
fascination of this beautiful field and of physics in general then one of my goals 
in writing this book is reached. 

Mainz, August 2004 Florian Scheck 



I will keep track of possible errata on a page attached to my home page. The latter can be accessed via 
http://www.thep.physik.uni-mainz.de/staff.html . 
1. Elementary Newtonian Mechanics



This chapter deals with the kinematics and the dynamics of a finite number of 
mass points that are subject to internal, and possibly external, forces, but whose 
motions are not further constrained by additional conditions on the coordinates. 
(The mathematical pendulum will be an exception.) Constraints such as requiring
some mass points to follow given curves in space, to keep their relative distance 
fixed, or the like, are introduced in Chap. 2. Unconstrained mechanical systems 
can be studied directly by means of Newton’s equations and do not require the 
introduction of new, generalized coordinates that incorporate the constraints and 
are dynamically independent. This is what is meant by “elementary” in the heading 
of this chapter - though some of its content is not elementary at all. In particular, 
at an early stage, we shall discover an intimate relationship between invariance 
properties under coordinate transformations and conservation laws of the theory, 
which will turn out to be a basic, constructive element for all of mechanics and 
which, for that matter, will be felt like a cantus firmus 1 throughout the whole of 
theoretical physics. The first, somewhat deeper analysis of these relations already 
leads one to consider the nature of the spatial and temporal manifolds that carry 
mechanical motions, thereby entering a discussion that is of central importance in 
present-day physics at both the smallest and the largest dimensions. 

We also introduce the notion of phase space, i.e. the description of physical 
motions in an abstract space spanned by coordinates and corresponding momenta, 
and thus prepare the ground for canonical mechanics in the formulation of Hamil- 
ton and Jacobi. 

We begin with Newton’s fundamental laws, which we interpret and translate 
into precise analytical statements. They are then illustrated by a number of exam- 
ples and some important applications. 



1.1 Newton’s Laws (1687) and Their Interpretation 

We begin by stating Newton’s fundamental laws in a formulation that is close to 
the original one. They are as follows: 

1 cantus firmus : a preexisting melody, such as a plainchant excerpt, which underlies a polyphonic 
musical composition. 



I. Every body continues in its state of rest or of uniform rectilinear motion, except 
if it is compelled by forces acting on it to change that state. 

II. The change of motion is proportional to the applied force and takes place in 
the direction of the straight line along which that force acts. 

III. To every action there is always an equal and contrary reaction; or, the mutual 
actions of any two bodies are always equal and oppositely directed along the 
same straight line. 

In order to understand these fundamental laws and to learn how to translate 
them into precise analytical expressions we first need to interpret them and to go 
through a number of definitions. On the one hand we must clarify what is meant 
by notions such as “body”, “state of motion”, “applied force”, etc. On the other 
hand we wish to collect a few (provisional) statements and assumptions about the 
space-time continuum in which mechanical motions take place. This will enable 
us to translate Newton’s laws into local equations, which can then be tested, in a 
quantitative manner, by comparison with experiment. 

Initially, “bodies” will be taken to be mass points, i.e. pointlike particles of 
mass m. These are objects that have no spatial extension but do carry a finite 
mass. While this idealization is certainly plausible for an elementary particle like 
the electron, in studying collisions on a billiard table, or relative motions in the 
planetary system, it is not clear, a priori, whether the billiard balls, the sun, or the 
planets can be taken to be massive but pointlike, i.e. without spatial extension. For 
the moment and in order to give at least a preliminary answer, we anticipate two 
results that will be discussed and proved later. 

(i) To any finite mass distribution (i.e. a mass distribution that can be com- 
pletely enclosed by a sphere of finite radius), or to any finite system of mass points, 
one can assign a center of gravity to which the resultant of all external forces ap- 
plies. This center behaves like a pointlike particle of mass M, under the action of 
that resultant, M being the total mass of the system (see Sects. 1.9 and 3.8). 

(ii) A finite mass distribution of total mass M that looks the same in every 
direction (one says it is spherically symmetric) creates a force field in the outer,
mass-free space that is identical to that of a pointlike particle of mass M located 
at its center of symmetry (Sect. 1.30). A spherical sun acts on a planet that does 
not penetrate it like a mass point situated at its center. In turn, the planet can be 
treated as a pointlike mass, too, as long as it is spherically symmetric. 




Fig. 1.1. Example of an orbit with accelerated motion.
While the orbit curve is a coordinate-independent, geometric
object, its description by the position vector r(t)
depends on the choice of origin and coordinates
In Law I, motion, or state of motion, refers to the trajectory $\mathbf{r}(t)$ of the mass
point in coordinate space $\mathbb{R}^3$, where $\mathbf{r}$ describes its position at a given time $t$.
Figure 1.1 shows an example of an arbitrary trajectory in three-dimensional space.
State of rest means that $\dot{\mathbf{r}}(t) = 0$ for all times $t$, while uniform rectilinear motion
is that motion along a straight line with constant velocity. It is important to realize
that motion is always relative motion of (at least two) physical systems. For
instance, a particle moves relative to an observer (i.e. a measuring apparatus). It is 
only meaningful to talk about the relative positions of particle A and particle B, 
or about the position of particle A with respect to an observer, at any fixed time. 

Experimental experience allows us to assume that the space in which the physical
motion of a mass point takes place is homogeneous and isotropic and that it
has the structure of a three-dimensional Euclidean space $\mathbb{E}^3$. Homogeneous here
means that no point of $\mathbb{E}^3$ is singled out in any respect. Isotropic in turn means that
there is no preferred direction either (more on this will be said in Sect. 1.14). Thus
the space of motions of the particle is an affine space, in agreement with physical
intuition: giving the position $x(t) \in \mathbb{E}^3$ of a particle at time $t$ is not meaningful,
while giving this $x(t)$ relative to the position $y(t)$ of an observer (at the same time)
is. If we endow the affine space with an origin, e.g. by relating all positions to a
given observer, the space is made into the real three-dimensional vector space $\mathbb{R}^3$.
This is a metric space on which scalar and cross products of vectors are defined 
as usual and for which base systems can be chosen in a variety of ways. 

In nonrelativistic physics time plays a special role. Daily experience tells us 
that time appears to be universal in the sense that it runs uniformly without being 
influenced by physical events. In order to sharpen this statement one may think of 
any moving particle as being accompanied on its journey by its own clock, which 
measures what is called the particle’s proper time $\tau$. On his clock an observer B
then measures the time

$$t^{(B)} = \alpha^{(B)} \tau + \beta^{(B)} . \tag{1.1}$$

Here $\alpha^{(B)}$ is a positive constant indicating the (relative) unit that B chooses in measuring
time, while $\beta^{(B)}$ indicates where B has chosen his origin of time, relative
to that of the moving clock.

Equation (1.1) can also be written in the form of a differential equation,

$$\frac{\mathrm{d}^2 t^{(B)}}{\mathrm{d}\tau^2} = 0 , \tag{1.2}$$

which is independent of the constants $\alpha^{(B)}$ and $\beta^{(B)}$. While (1.1) relates the proper
time to that of a specific observer, (1.2) contains the statement that is of interest
here for all possible observers. We conclude that time is described by a one-dimensional
affine space, or, after having chosen a zero, by the real line $\mathbb{R}$. For
the sake of clarity we shall sometimes also write $\mathbb{R}_t$ (“t” for “time”). 2
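
As an added illustration, the equivalence of (1.1) and (1.2) can be made explicit by integrating twice. Here $t^{(B)}$ denotes the time read by observer B and $\tau$ the proper time of the moving clock; the two integration constants are exactly the unit factor and the choice of origin:

```latex
\frac{\mathrm{d}^2 t^{(B)}}{\mathrm{d}\tau^2} = 0
\;\Longrightarrow\;
\frac{\mathrm{d} t^{(B)}}{\mathrm{d}\tau} = \alpha^{(B)} = \text{const.}
\;\Longrightarrow\;
t^{(B)}(\tau) = \alpha^{(B)} \tau + \beta^{(B)} .
```

The condition $\alpha^{(B)} > 0$ then expresses that both clocks run in the same direction.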



2 It would be premature to conclude that the space-time of nonrelativistic physics is simply $\mathbb{R}^3 \times \mathbb{R}_t$
as long as one does not know the symmetry structure that is imposed on it by the dynamics. We 
return to this question in Sect. 1.14. In Sect. 4.7 we analyze the analogous situation in relativistic 
mechanics. 
The trajectory r(t) is often described in terms of a specific coordinate system.
It may be expressed by means of Cartesian coordinates, 

r(t) = (x(t), y(t), z(t)) , 

or by means of spherical coordinates 

$\mathbf{r}(t) : \{r(t), \varphi(t), \theta(t)\}$ ,

or any other coordinates that are adapted to the system one is studying. 
Examples of motions in space are: 

(i) $\mathbf{r}(t) = (v_x t + x_0,\ 0,\ v_z t + z_0 - g t^2/2)$ in Cartesian coordinates. This describes,
in the $x$-direction, uniform motion with constant velocity $v_x$, a state of rest
in the $y$-direction, and, in the $z$-direction, the superposition of the uniform
motion with velocity $v_z$ and free fall in the gravitational field of the earth.

(ii) $\mathbf{r}(t) = (x(t) = R\cos(\omega t + \phi_0),\ y(t) = R\sin(\omega t + \phi_0),\ 0)$.

(iii) $\mathbf{r}(t) : (r(t) = R,\ \varphi(t) = \phi_0 + \omega t,\ 0)$.

Examples (ii) and (iii) represent the same motion in different coordinates: the
trajectory is a circle of radius $R$ in the $(x, y)$-plane that the particle follows with
constant angular velocity $\omega$.

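From the knowledge of the function $\mathbf{r}(t)$ follow the velocity

$$\mathbf{v}(t) \stackrel{\text{def}}{=} \frac{\mathrm{d}}{\mathrm{d}t}\, \mathbf{r}(t) \equiv \dot{\mathbf{r}} \tag{1.3}$$

and the acceleration

$$\mathbf{a}(t) \stackrel{\text{def}}{=} \frac{\mathrm{d}}{\mathrm{d}t}\, \mathbf{v}(t) \equiv \dot{\mathbf{v}} = \ddot{\mathbf{r}} . \tag{1.4}$$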

In Example (i) above, $\mathbf{v} = (v_x, 0, v_z - g t)$ and $\mathbf{a} = (0, 0, -g)$. In Examples
(ii) and (iii) we have $\mathbf{v} = \omega R\,(-\sin(\omega t + \phi_0), \cos(\omega t + \phi_0), 0)$ and
$\mathbf{a} = \omega^2 R\,(-\cos(\omega t + \phi_0), -\sin(\omega t + \phi_0), 0)$, i.e. $\mathbf{v}$ has magnitude $\omega R$ and
direction tangent to the circle of motion. The acceleration has magnitude $\omega^2 R$ and
is directed towards the center of that circle.
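
The statements about the circular motion of Examples (ii) and (iii) can be checked numerically. The following Python sketch is an added illustration: it approximates velocity and acceleration by central finite differences, and the parameter values (`R`, `omega`, `phi0`), the sample time and the step size `h` are arbitrary assumptions, not taken from the text.

```python
import math

# Illustrative parameters (assumed, not from the text)
R, omega, phi0 = 2.0, 3.0, 0.5

def position(t):
    """Circular trajectory of Example (ii) in Cartesian coordinates."""
    return (R * math.cos(omega * t + phi0),
            R * math.sin(omega * t + phi0),
            0.0)

def derivative(f, t, h=1e-4):
    """Componentwise central finite difference (f(t+h) - f(t-h)) / (2h)."""
    plus, minus = f(t + h), f(t - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def velocity(t):
    return derivative(position, t)

def acceleration(t):
    return derivative(velocity, t)

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def norm(u):
    return math.sqrt(dot(u, u))

t = 0.7
r, v, a = position(t), velocity(t), acceleration(t)
print("|v|   =", norm(v), " expected", omega * R)
print("|a|   =", norm(a), " expected", omega**2 * R)
print("r . v =", dot(r, v), " (tangent to the circle: close to 0)")
print("a . r =", dot(a, r), " expected", -omega**2 * R**2)
```

The negative value of `a . r` reflects that the acceleration is antiparallel to the position vector, i.e. it points towards the center of the circle.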

The velocity vector is a tangent vector to the trajectory and therefore lies in 
the tangent space of the manifold of position vectors, at the point $\mathbf{r}$. If $\mathbf{r} \in \mathbb{R}^3$,
this tangent space is also an $\mathbb{R}^3$ and can be identified with the space of positions.
There are cases, however, where we have to distinguish between the position space 
and its tangent spaces. A similar remark applies to the acceleration vector. 



1.2 Uniform Rectilinear Motion and Inertial Systems 

Definition. Uniform rectilinear motion is a state of motion with constant velocity
and therefore vanishing acceleration, $\ddot{\mathbf{r}} = 0$.
The trajectory has the general form

$$\mathbf{r}(t) = \mathbf{r}^0 + \mathbf{v}^0 t , \tag{1.5}$$

where $\mathbf{r}^0$ denotes the initial position, $\mathbf{v}^0$ the initial velocity, $\mathbf{r}^0 = \mathbf{r}(t = 0)$, and
$\mathbf{v}^0 = \mathbf{v}(t = 0)$. The velocity is constant at all times and the acceleration is zero:

$$\mathbf{v}(t) = \dot{\mathbf{r}}(t) = \mathbf{v}^0 , \qquad \mathbf{a}(t) = \ddot{\mathbf{r}}(t) = 0 . \tag{1.6}$$

We remark that (1.6) are differential equations characteristic for uniform motion.
A specific solution is only defined if the initial conditions $\mathbf{r}(0) = \mathbf{r}^0$,
$\mathbf{v}(0) = \mathbf{v}^0$ are given. Equation (1.6) is a linear, homogeneous system of differential
equations of second order; $\mathbf{v}^0$ and $\mathbf{r}^0$ are integration constants that can be
freely chosen.

Law I states that (1.5) with arbitrary constants $\mathbf{r}^0$ and $\mathbf{v}^0$ is the characteristic
state of motion of a mechanical body to which no forces are applied. This statement
supposes that we have already chosen a certain frame of reference, or a class
of frames, in coordinate space. Indeed, if all force-free motions are described by
the differential equation $\ddot{\mathbf{r}} = 0$ in the reference frame $K_0$, this is not true in a
frame $K$ that is accelerated with respect to $K_0$ (see Sect. 1.25 for the case of rotating
frames). In $K$ there will appear fictitious forces such as the centrifugal and
the Coriolis forces, and, as a consequence, force-free motion will look very complicated.
There exist, in fact, specific frames of reference with respect to which
force-free motion is always uniform and rectilinear. They are defined as follows.

Definition. Reference frames with respect to which Law I has the analytical form
$\ddot{\mathbf{r}}(t) = 0$ are called inertial frames.

In fact, the first of Newton’s laws defines the class of inertial frames. This is 
the reason why it is important in its own right and is more than just a special case 
of Law II. With respect to inertial frames the second law then has the form 

$$m \ddot{\mathbf{r}}(t) = \mathbf{K} ,$$



where K is the resultant of the forces applied to the body. Thus Newton’s second 
law takes a particularly simple form in inertial systems. If one chooses to describe 
the motion by means of reference frames that are accelerated themselves, this fun- 
damental law will take a more complicated form although it describes the same 
physical situation. Besides the resultant K there will appear additional, fictitious 
forces that depend on the momentary acceleration of the noninertial system. 

The inertial systems are particularly important because they single out the group 
of those transformations of space and time for which the equations of motion (i.e. 
the equations that follow from Newton’s laws) are form invariant (i.e. the structure
of the equations remains the same). In Sect. 1.13 we shall construct the class
of all inertial frames. The following proposition is particularly important in this 
connection. 
1.3 Inertial Frames in Relative Motion



Let K be an inertial frame. Any frame K' that moves with constant velocity 
w relative to K is also inertial (see Fig. 1.2). 




Fig. 1.2. If K is an inertial system, 
then so is every K' whose axes are 
parallel to those of K and which 
moves with constant velocity w 
relative to K 



Proof. The position vector $\mathbf{r}(t)$ with respect to $K$ becomes $\mathbf{r}'(t) = \mathbf{r}(t) - \mathbf{w} t$ with
respect to $K'$. Since $\mathbf{w}$ is constant, there follows $\ddot{\mathbf{r}}'(t) = \ddot{\mathbf{r}}(t) = 0$. All force-free
motions satisfy the same differential equation (1.6) in either reference system, both
of which are therefore inertial frames. □

The individual solution (1.5) looks different in $K$ than in $K'$: if the systems
coincide at $t = 0$, the initial condition $(\mathbf{r}^0, \mathbf{v}^0)$ with respect to the first is equivalent
to the initial condition $(\mathbf{r}^0, \mathbf{v}^0 - \mathbf{w})$ with respect to the second.
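
The change of initial conditions can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the function names `galilei_boost` and `uniform_motion` and all numerical values are hypothetical choices made for illustration. It compares the force-free trajectory of K, transformed to K', with the trajectory computed directly in K' from the initial data (r0, v0 - w).

```python
def galilei_boost(r, w, t):
    """Position with respect to K', which moves with constant velocity w
    relative to K and coincides with K at t = 0: r'(t) = r(t) - w t."""
    return tuple(ri - wi * t for ri, wi in zip(r, w))

def uniform_motion(r0, v0, t):
    """Force-free trajectory of the form (1.5): r(t) = r0 + v0 t."""
    return tuple(a + b * t for a, b in zip(r0, v0))

# Illustrative values (assumed)
r0 = (1.0, 2.0, 0.0)    # initial position, the same in K and K'
v0 = (0.5, 0.0, -1.0)   # initial velocity with respect to K
w = (0.2, 0.3, 0.0)     # velocity of K' relative to K

v0_prime = tuple(a - b for a, b in zip(v0, w))  # initial velocity in K'

for t in (0.0, 1.0, 2.5):
    seen_from_K_prime = galilei_boost(uniform_motion(r0, v0, t), w, t)
    direct_in_K_prime = uniform_motion(r0, v0_prime, t)
    print(t, seen_from_K_prime, direct_in_K_prime)
```

Both columns agree up to rounding, which is the content of the remark above: the motion is uniform and rectilinear in either frame, only the initial velocity changes.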



1.4 Momentum and Force 

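In Law II we identify “motion” with the momentum:

$$\mathbf{p}(t) \stackrel{\text{def}}{=} m \dot{\mathbf{r}}(t) \equiv m \mathbf{v} , \tag{1.7}$$

i.e. the product of inertial mass and momentary velocity. The second law, when
expressed as a formula, then reads 3

$$\frac{\mathrm{d}}{\mathrm{d}t}\, \mathbf{p}(t) = \mathbf{K}(\mathbf{r}, \dot{\mathbf{r}}, t) \tag{1.8a}$$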



3 We have interpreted “change of motion” as the time derivative of the momentum. Law II does
not say this so clearly.
or, if the inertial mass is independent of the state of motion,

$$m \ddot{\mathbf{r}}(t) = \mathbf{K}(\mathbf{r}, \dot{\mathbf{r}}, t) . \tag{1.8b}$$

If the second form (1.8b) applies, the proportionality factor $m$, the inertial mass
of the body, can be determined, with respect to a sample body of reference mass
$m_1$, by alternately exposing the body and the sample to the same force field and
by comparing the resultant accelerations: their ratio fulfills $m/m_1 = |\ddot{\mathbf{r}}^{(1)}|/|\ddot{\mathbf{r}}|$.

The mass of macroscopic bodies can be changed by adding or removing matter.
In nonrelativistic physics at macroscopic dimensions mass is an additive quantity;
that is, if one joins two bodies of masses $m_1$ and $m_2$, their union has mass
$(m_1 + m_2)$. Another way of expressing this fact is to say that mass is an extensive
quantity.

In the realm of physics at microscopic dimensions one finds that mass is an invariant,
characteristic property. Every electron has the mass $m_e = 9.11 \times 10^{-31}$ kg,
a hydrogen atom has a fixed mass, which is the same for any other hydrogen atom,
all photons are strictly massless, etc.

The relationship (1.7) holds only as long as the velocity is small as compared
to the speed of light $c \approx 3 \times 10^8$ m/s. If this is not the case the momentum is
given by a more complicated formula, viz.

$$\mathbf{p}(t) = \frac{m}{\sqrt{1 - v^2(t)/c^2}}\, \mathbf{v}(t) , \tag{1.9}$$

where $c$ is the velocity of light (see Chap. 4). For $|\mathbf{v}| \ll c$ the expressions (1.9)
and (1.7) differ by terms of order $O(|\mathbf{v}|^2/c^2)$. For these reasons (mass being an
invariant property of elementary particles, and its role in the limit of small velocity)
one also calls the quantity $m$ the rest mass of the particle. In the
older literature, when considering the quotient $m/\sqrt{1 - v^2(t)/c^2}$, one sometimes
talked about this as the moving, velocity-dependent mass. It is advisable, however,
to avoid this distinction altogether because it blurs the invariant nature of
rest mass and hides an essential difference between relativistic and nonrelativistic
kinematics. In talking about mass we will always have in mind the invariant rest
mass.
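
To get a feeling for the size of the correction, the following Python sketch (an added illustration, restricted to the magnitudes of momentum and velocity) compares the nonrelativistic expression (1.7) with the relativistic one (1.9). The function names and the sampled velocities are assumptions made here; the speed of light is a rounded value and the electron mass is the one quoted above.

```python
import math

C = 2.998e8  # speed of light in m/s (rounded)

def momentum_nonrel(m, v):
    """Magnitude of the momentum according to (1.7)."""
    return m * v

def momentum_rel(m, v):
    """Magnitude of the momentum according to (1.9); requires |v| < C."""
    return m * v / math.sqrt(1.0 - (v / C) ** 2)

m_e = 9.11e-31  # electron rest mass in kg

for beta in (0.001, 0.01, 0.1, 0.5):
    v = beta * C
    rel_diff = momentum_rel(m_e, v) / momentum_nonrel(m_e, v) - 1.0
    print(f"v/c = {beta:5.3f}   relative difference = {rel_diff:.3e}"
          f"   v^2/(2 c^2) = {beta**2 / 2:.3e}")
```

For small v/c the relative difference approaches v²/(2c²), the leading term in the expansion of the square root, in line with the statement that the two expressions differ by terms of order |v|²/c².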

We assume the force $\mathbf{K}(\mathbf{r}, \dot{\mathbf{r}}, t)$ to be given a priori. More precisely we are talking
about a force field, i.e. a vector-valued function over the space of coordinates
and, if the forces are velocity dependent, the space of velocities. At every point 
of this six-dimensional space where K is defined this function gives the force that 
acts on the mass point at time t. Such force fields, in general, stem from other 
physical bodies, which act as their sources. Force fields are vector fields. This 
means that different forces that are applied at the same point in space, at a given 
time, must be added vectorially. 

In Law III the notion “action” stands for the (internal) force that one body
exerts on another. Consider a system of finitely many mass points with masses
$m_i$ and position vectors $\mathbf{r}_i(t)$, $i = 1, 2, \ldots, n$. Let $\mathbf{F}_{ik}$ be the force that particle
$i$ exerts on particle $k$. One then has $\mathbf{F}_{ik} = -\mathbf{F}_{ki}$. Forces of this kind are called
internal forces of the $n$-particle system. This distinction is necessary if one wishes
to describe the interaction with further, possibly very heavy, external objects by 
means of external forces. This is meaningful, for instance, whenever the reaction 
on the external objects is negligible. One should keep in mind that the distinction 
between internal and external forces is artificial and is made only for practical 
reasons. The source of an external force can always be defined to be part of the 
system, thus converting the external to an internal force. Conversely, the example 
discussed in Sect. 1.7 below shows that the two-body problem with internal forces 
can be reduced to an effective one-body problem through separation of the center-of-mass
motion, where a fictitious particle of mass $\mu = m_1 m_2/(m_1 + m_2)$ moves
in the field of an external force.
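
Two limiting cases of the mass of this fictitious particle, the reduced mass m1 m2/(m1 + m2), are easy to check. The short Python sketch below is an added illustration; the function name `reduced_mass` and the sample masses are assumptions.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1 m2 / (m1 + m2) of a two-body system."""
    return m1 * m2 / (m1 + m2)

# Equal masses: mu is half of either mass.
print(reduced_mass(1.0, 1.0))

# Very unequal masses: mu is close to the lighter mass, i.e. the heavy
# partner acts almost like a fixed external source.
print(reduced_mass(1.0, 1.0e6))
```

The second case is the situation described above in which the reaction on a very heavy object is negligible and its influence may be treated as an external force.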



1.5 Typical Forces. A Remark About Units 

The two most important fundamental forces of nature are the gravitational force 
and the Coulomb force. The other fundamental forces known to us, i.e. those describing
the strong and the weak interactions of elementary particles, have very
small ranges of about $10^{-15}$ m and $10^{-18}$ m, respectively. Therefore, they play no
role in mechanics at laboratory scales or in the planetary system.

The gravitational force is always attractive and has the form

$$\mathbf{F}_{ki} = -G\, m_i m_k\, \frac{\mathbf{r}_i - \mathbf{r}_k}{|\mathbf{r}_i - \mathbf{r}_k|^3} . \tag{1.10}$$

This is the force that particle $k$ with mass $m_k$ applies to particle $i$ whose mass
is $m_i$. It points along the straight line that connects the two, is directed from $i$
to $k$, and is inversely proportional to the square of the distance between $i$ and $k$.
$G$ is Newton’s gravitational constant. Apart from $G$, (1.10) contains the gravitational
masses (heavy masses or weights) $m_i$ and $m_k$. These are to be understood as
parameters characterizing the strength of the interaction. Experiment tells us that
gravitational and inertial masses are proportional to one another (“all bodies fall
at the same speed”), i.e. that they are essentially of the same nature. This highly
remarkable property of gravitation is the starting point for Einstein’s equivalence
principle and for the theory of general relativity. If read as the gravitational mass,
$m_i$ determines the strength of the coupling of particle $i$ to the force field created by
particle $k$. If understood as being the inertial mass, it determines the local acceleration
in a given force field. (The third of Newton’s laws ensures that the situation
is symmetric in $i$ and $k$, so that the discussion of particle $k$ in the field of particle
$i$ is exactly the same.)
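
The direction and the symmetry of the gravitational force can be verified with a small Python sketch. It is an added illustration that assumes (1.10) in the standard form F_ki = -G m_i m_k (r_i - r_k)/|r_i - r_k|^3; the function name `gravitational_force`, the rounded value of G and the sample configuration are choices made here, not part of the text.

```python
import math

G = 6.674e-11  # Newton's constant in m^3 kg^-1 s^-2 (rounded value)

def gravitational_force(r_i, r_k, m_i, m_k):
    """Force F_ki exerted by particle k on particle i, cf. (1.10)."""
    d = tuple(a - b for a, b in zip(r_i, r_k))   # r_i - r_k
    dist = math.sqrt(sum(c * c for c in d))
    pref = -G * m_i * m_k / dist**3
    return tuple(pref * c for c in d)

# Illustrative configuration (assumed values, SI units)
r_1, m_1 = (0.0, 0.0, 0.0), 5.0
r_2, m_2 = (3.0, 4.0, 0.0), 2.0

F_21 = gravitational_force(r_1, r_2, m_1, m_2)   # force of particle 2 on particle 1
F_12 = gravitational_force(r_2, r_1, m_2, m_1)   # force of particle 1 on particle 2
print("F_21 =", F_21)
print("F_12 =", F_12)
```

In this configuration `F_21` has positive x- and y-components, i.e. it points from particle 1 towards particle 2 (attraction), and `F_12` is its negative, as Law III requires.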

In the case of the Coulomb force, matters are different: here the strength is 
determined by the electric charges $e_i$ and $e_k$ of the two particles,

$$\mathbf{F}_{ki} = \kappa_C\, e_i e_k\, \frac{\mathbf{r}_i - \mathbf{r}_k}{|\mathbf{r}_i - \mathbf{r}_k|^3} \,, \tag{1.11}$$

which are not correlated (for macroscopic bodies) to their masses. A ball made of
iron with given mass may be uncharged or may carry positive or negative charges. 
The strength as well as the sign of the force are determined by the charges. For 
$\operatorname{sign} e_i = \operatorname{sign} e_k$ it is repulsive; for $\operatorname{sign} e_i = -\operatorname{sign} e_k$ it is attractive. If one
changes the magnitude of $e_k$, for instance, the strength varies proportionally to $e_k$.
The accelerations induced by this force, however, are determined by the inertial
masses as before. The parameter $\kappa_C$ is a constant that depends on the units used
(see below).

Apart from these fundamental forces we consider many more forms of forces 
that may occur or may be created in the macroscopic world of the laboratory. 
Specific examples are the harmonic force, which is always attractive and whose 
magnitude is proportional to the distance (Hooke’s law), or those force fields which 
arise from the variety of electric and magnetic fields that can be created by all kinds 
of arrangements of conducting elements and coils. Therefore it is meaningful to 
regard the force field on the right-hand side of (1.8) as an independent element of 
the theory that can be chosen at will. The equation of motion (1.8) describes, in 
differential form, how the particle of mass m will move under the influence of the 
force field. If the situation is such that the particle does not disturb the source of the 
force field in any noticeable way (in the case of gravitation this is true whenever 
$m \ll M_{\text{source}}$) the particle may be taken as a probe: by measuring its accelerations
one can locally scan the force field. If this is not a good approximation, Law III
becomes important and one should proceed as in Sect. 1.7 below.

We conclude this section with a remark about units. To begin with, it is clear 
that we must define units for three observable quantities: time, length in coordinate 
space, and mass. We denote their dimensions by T, L, and M, respectively: 

[t] = T , [r] = L , [m] = M , 

the symbol $[x]$ meaning the physical dimension of the quantity $x$. The dimensions
and measuring units for all other quantities that occur in mechanics can be reduced
to these basic units and are therefore fixed once a choice is made for them. For
instance, we have

momentum: $[\mathbf{p}] = M L T^{-1}$ ,

force: $[\mathbf{K}] = M L T^{-2}$ ,

energy = force × displacement: $[E] = M L^{2} T^{-2}$ ,

pressure = force/area: $[p] = M L^{-1} T^{-2}$ .

For example, one can choose to measure time in seconds, length in centimeters,
and mass in grams. The unit of force is then 1 g cm s$^{-2}$ = 1 dyn, the energy unit
is 1 g cm$^{2}$ s$^{-2}$ = 1 erg, etc. However, one should follow the International System
of Units (SI), which was agreed on and fixed by law for use in the engineering
sciences and for the purposes of daily life. In this system time is measured in
seconds, length in meters, and mass in kilograms, so that one obtains the following
derived units:

force: 1 kg m s$^{-2}$ = 1 Newton (= $10^{5}$ dyn) ,
energy: 1 kg m$^{2}$ s$^{-2}$ = 1 Joule (= $10^{7}$ erg) ,
pressure: 1 kg m$^{-1}$ s$^{-2}$ = 1 Pascal = 1 Newton/m$^{2}$ .

If one identifies gravitational and inertial mass, one finds the following value 
for Newton’s gravitational constant from experiment: 

$$G = (6.67259 \pm 0.00085) \times 10^{-11}\ \text{m}^3\,\text{kg}^{-1}\,\text{s}^{-2} \,.$$

For the Coulomb force the factor $\kappa_C$ in (1.11) can be chosen to be 1. (This is
the choice in the Gaussian system of electrodynamics.) With this choice electric
charge is a derived quantity and has dimension

$$[e] = M^{1/2} L^{3/2} T^{-1} \qquad (\kappa_C = 1) \,.$$

If instead one wishes to define a unit for charge on its own, or, equivalently, a unit
for another electromagnetic quantity such as voltage or current, one must choose
the constant $\kappa_C$ accordingly. The SI unit of current is 1 ampere. This fixes the unit
of charge, and the constant in (1.11) must then be chosen to be

$$\kappa_C = \frac{1}{4\pi\varepsilon_0} = c^2 \times 10^{-7} \,,$$

where $\varepsilon_0 = 10^{7}/(4\pi c^2)$ and $c$ is the speed of light.
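As a quick numerical check of the SI value of the constant in (1.11), the following sketch evaluates $\kappa_C = c^2 \times 10^{-7}$ and $\varepsilon_0 = 10^{7}/(4\pi c^2)$. The numerical value of $c$ in m/s is not quoted in the text and is supplied here as an assumption; the function names are purely illustrative.

```python
import math

# Assumption (not quoted in the text): speed of light in m/s.
C = 299_792_458.0


def coulomb_constant_si(c: float = C) -> float:
    """Numerical value of kappa_C = c^2 * 1e-7, with c given in m/s."""
    return c**2 * 1.0e-7


def epsilon_0(c: float = C) -> float:
    """Numerical value of eps_0 = 1e7 / (4 pi c^2), with c given in m/s."""
    return 1.0e7 / (4.0 * math.pi * c**2)


kappa = coulomb_constant_si()            # roughly 8.99e9 in SI units
kappa_from_eps = 1.0 / (4.0 * math.pi * epsilon_0())
print(kappa, kappa_from_eps)             # both expressions should agree
```

Both expressions give the same number, which illustrates that the two ways of writing the constant are equivalent.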



1.6 Space, Time, and Forces 

At this point it may be useful to give a provisional summary of our discussion of 
Newton’s laws I–III. The first law shows the uniform rectilinear motion (1.5) to be
the natural form of motion of every body that is not subject to any forces. If we
send such a body from A to B it chooses the shortest connection between these
points, a straight line. As one may talk in a physically meaningful manner only
about motion relative to an observer, Law I raises the question in which frames of
reference the law actually holds. In fact, Law I defines the important class of
inertial systems. Only with respect to these does Law II assume the simple form 
(1.8b). 

The space that supports the motions described by Newton’s equations is a three- 
dimensional Euclidean space, i.e. a real space where we are allowed to use the 
well-known Euclidean geometry. A priori this is an affine space. By choosing an 
origin we make it a real vector space, here $\mathbb{R}^3$. Important properties of the space
of physical motions are its homogeneity (“it looks the same everywhere”) and its
isotropy (“all directions are equally good”). Time is one-dimensional; it is represented
by points on the real line. In particular, there is an ordering relation which
classifies times into “earlier” and “later”, past and future.

Combining the momentary position of a particle and the time at which it takes
on that position, we obtain an event $(\mathbf{x}(t), t) \in \mathbb{R}^3 \times \mathbb{R}_t$, a point in the combined
space-time continuum. This definition is particularly important for relativistic
physics, which exhibits a deeper symmetry between space and time, as we shall
see later.

In comparing (1.2) and (1.8b) notice the asymmetry between the space and the
time variables of a particle. Let $\tau$ again be the proper time as in (1.1), and $t$ the
time measured by an observer. For the sake of simplicity we choose the same unit
for both, i.e. we set $\alpha^{(i)} = 1$. Equation (1.2) tells us that time runs uniformly and
does not depend on the actual position of the particle nor on the forces which are
applied to it. In contrast, the equation of motion (1.8) describes as a function of time
the set of all possible trajectories that the particle can move on when it is subject
to the given force field. Another way of expressing this asymmetry is this: $\mathbf{r}(t)$ is
the dynamical variable. Its temporal evolution is determined by the forces, i.e. by
the dynamics. The time variable, on the other hand, plays the role of a parameter 
in nonrelativistic mechanics, somewhat like the length function in the description 
of a curve in space. This difference in the assignment of the variables’ roles is 
characteristic of the nonrelativistic description of systems of mass points. It does 
not hold for continuum mechanics or for any other field theory. It is modified also 
in physics obeying special relativity, where space and time hold more symmetric 
roles. 

Having clarified the notions in terms of which Newton’s laws are formulated, 
we now turn to an important application: the two-body problem with internal 
forces. 



1.7 The Two-Body System with Internal Forces 

1.7.1 Center-of-Mass and Relative Motion 

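In terms of the coordinates $\mathbf{r}_1$, $\mathbf{r}_2$ of the two particles whose masses are $m_1$ and
$m_2$, the equations of motion read

$$m_1\ddot{\mathbf{r}}_1 = \mathbf{F}_{21} \,, \qquad m_2\ddot{\mathbf{r}}_2 = \mathbf{F}_{12} = -\mathbf{F}_{21} \,. \tag{1.12}$$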

The force that particle “2” exerts on particle “1” is denoted by $\mathbf{F}_{21}$. We will adopt
this notation throughout: $\mathbf{F}_{ki}$ is the force field that is created by particle number
$k$ and is felt by particle number $i$. Taking the sum of these we obtain the equation
$m_1\ddot{\mathbf{r}}_1 + m_2\ddot{\mathbf{r}}_2 = 0$, which is valid at all times. We define the center-of-mass
coordinates

$$\mathbf{r}_S = \frac{1}{m_1 + m_2}\,(m_1\mathbf{r}_1 + m_2\mathbf{r}_2) \,; \tag{1.13}$$

this means that $\ddot{\mathbf{r}}_S = 0$, i.e. the center of mass moves at a constant velocity. The
dynamics proper is to be found in the relative motion. Define

$$\mathbf{r} \overset{\text{def}}{=} \mathbf{r}_1 - \mathbf{r}_2 \,. \tag{1.14}$$

Fig. 1.3. (a) Definition of center-of-mass and relative coordinates for the two-body system. The
relative coordinate $\mathbf{r}$ is independent of the choice of origin. (b) Force and reaction force in the
two-body system. These give rise to a central force in the equation of motion for the relative
coordinate



By inverting (1.13) and (1.14) we have (see also Fig. 1.3)

$$\mathbf{r}_1 = \mathbf{r}_S + \frac{m_2}{m_1 + m_2}\,\mathbf{r} \,, \qquad \mathbf{r}_2 = \mathbf{r}_S - \frac{m_1}{m_1 + m_2}\,\mathbf{r} \,. \tag{1.15}$$

Inserting these in (1.12), we find that the equation of motion becomes 

$$\mu\ddot{\mathbf{r}} = \mathbf{F}_{21} \,. \tag{1.16}$$

The mass parameter

$$\mu \overset{\text{def}}{=} \frac{m_1 m_2}{m_1 + m_2}$$

is called the reduced mass. By separating the center of mass we have reduced the
two-body problem to the motion of one particle with mass $\mu$.
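The change of variables (1.13) to (1.15) can be checked numerically. The following is a minimal sketch, assuming positions are given as plain tuples of Cartesian components; the function names are hypothetical and chosen only for illustration.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1 m2 / (m1 + m2)."""
    return m1 * m2 / (m1 + m2)


def to_cm_relative(m1, r1, m2, r2):
    """Center-of-mass coordinate (1.13) and relative coordinate (1.14)."""
    total = m1 + m2
    r_s = tuple((m1 * a + m2 * b) / total for a, b in zip(r1, r2))
    r = tuple(a - b for a, b in zip(r1, r2))
    return r_s, r


def from_cm_relative(m1, m2, r_s, r):
    """Inverse transformation (1.15): recover r1 and r2 from r_S and r."""
    total = m1 + m2
    r1 = tuple(s + m2 / total * d for s, d in zip(r_s, r))
    r2 = tuple(s - m1 / total * d for s, d in zip(r_s, r))
    return r1, r2


# Round trip for arbitrary sample values (illustrative numbers only).
r_s, r = to_cm_relative(2.0, (1.0, 2.0, 3.0), 3.0, (-4.0, 0.5, 2.0))
print(r_s, r)
print(from_cm_relative(2.0, 3.0, r_s, r))
```

Applying `from_cm_relative` to the output of `to_cm_relative` should return the original positions up to rounding, which is the content of "inverting (1.13) and (1.14)".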
1.7.2 Example: The Gravitational Force Between Two Celestial Bodies
(Kepler’s Problem) 



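In the case of the gravitational interaction (1.10), (1.12) becomes

$$\begin{aligned}
m_1\ddot{\mathbf{r}}_1 &= -G\,\frac{m_1 m_2}{r^2}\,\frac{\mathbf{r}_1 - \mathbf{r}_2}{r} = -G\,\frac{m_1 m_2}{r^2}\,\frac{\mathbf{r}}{r} \,, \\
m_2\ddot{\mathbf{r}}_2 &= -G\,\frac{m_1 m_2}{r^2}\,\frac{\mathbf{r}_2 - \mathbf{r}_1}{r} = +G\,\frac{m_1 m_2}{r^2}\,\frac{\mathbf{r}}{r} \,,
\end{aligned} \tag{1.17}$$

where $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ and $r = |\mathbf{r}|$, from which follow the equations of motion in
center-of-mass and relative coordinates

$$\ddot{\mathbf{r}}_S = 0 \qquad \text{and} \qquad \mu\ddot{\mathbf{r}} = -G\,\frac{m_1 m_2}{r^2}\,\frac{\mathbf{r}}{r} \,.$$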
We can read off the behavior of the system from these equations: the center of mass
moves uniformly along a straight line (or remains at rest). The relative motion is
identical to the motion of a single, fictitious particle of mass $\mu$ under the action
of the force

$$-G\,\frac{m_1 m_2}{r^2}\,\frac{\mathbf{r}}{r} \,.$$

Since this is a central force, i.e. one which always points towards the origin or
away from it, it can be derived from a potential $U(r) = -A/r$ with $A = G m_1 m_2$.
This can be seen as follows.

Central forces have the general form $\mathbf{F}(\mathbf{r}) = f(r)\,\hat{\mathbf{r}}$, where $\hat{\mathbf{r}} = \mathbf{r}/r$ and $f(r)$
is a scalar function that should be (at least) continuous in the variable $r = |\mathbf{r}|$.
Define then

$$U(r) - U(r_0) = -\int_{r_0}^{r} f(r')\,dr' \,,$$

where $r_0$ is an arbitrary reference value and where $U(r_0)$ is a constant. If we take
the gradient of this expression this constant does not contribute and we obtain

$$\nabla U(r) = \frac{dU(r)}{dr}\,\nabla r = -f(r)\,\nabla\sqrt{x^2 + y^2 + z^2} = -f(r)\,\mathbf{r}/r \,.$$

Thus, $\mathbf{F}(\mathbf{r}) = -\nabla U(r)$. In the case of central forces the orbital angular momentum

$$\mathbf{l} \overset{\text{def}}{=} \mu\,\mathbf{r} \times \dot{\mathbf{r}}$$

is conserved; both its magnitude and its direction are constants in time. This follows
from the observation that the acceleration is proportional to $\mathbf{r}$: $d\mathbf{l}/dt = \mu\,\mathbf{r} \times \ddot{\mathbf{r}} = 0$.

As a consequence, the motion takes place entirely in a plane perpendicular to $\mathbf{l}$,
namely the one spanned by $\mathbf{r}_0$ and $\mathbf{v}_0$. Since the motion is planar, it is convenient
to introduce polar coordinates in that plane, viz.

$$x(t) = r(t)\cos\phi(t) \,, \qquad y(t) = r(t)\sin\phi(t) \,, \tag{1.18}$$

so that the components of the angular momentum are

$$l_x = l_y = 0 \,, \qquad l_z = \mu r^2\dot{\phi} \equiv l = \text{const} \,,$$

and, finally,

$$\dot{\phi} = \frac{l}{\mu r^2} \,. \tag{1.19a}$$

Furthermore the total energy, i.e. the sum of kinetic and potential energy, is conserved.
In order to show this start from the equation of motion for a particle in
the force field of a more general potential $U(\mathbf{r})$

$$\mu\ddot{\mathbf{r}} = -\nabla U(\mathbf{r}) \,.$$

This is an equation relating two vector fields, the acceleration multiplied by the
reduced mass on the left-hand side, and the gradient field of the scalar function
$U(\mathbf{r})$ on the right-hand side. Take the scalar product of these vector fields with
the velocity $\dot{\mathbf{r}}$ to obtain the scalar equation

$$\mu\,\ddot{\mathbf{r}} \cdot \dot{\mathbf{r}} = -\dot{\mathbf{r}} \cdot \nabla U(\mathbf{r}) \,.$$



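The left-hand side is the time derivative of $(\mu/2)\,\dot{\mathbf{r}}^2$. On the right-hand side, and
with the decomposition $\mathbf{r} = \{x, y, z\}$, one has

$$\dot{\mathbf{r}} \cdot \nabla U(\mathbf{r}) = \frac{dx}{dt}\frac{\partial U(\mathbf{r})}{\partial x} + \frac{dy}{dt}\frac{\partial U(\mathbf{r})}{\partial y} + \frac{dz}{dt}\frac{\partial U(\mathbf{r})}{\partial z} \,,$$

which is nothing but the total time derivative of the function $U(\mathbf{r}(t))$ along smooth
curves $\mathbf{r}(t)$ in $\mathbb{R}^3$. If these are solutions of the equation of motion, i.e. if they fulfill
$\mu\,\ddot{\mathbf{r}} \cdot \dot{\mathbf{r}} = -\dot{\mathbf{r}} \cdot \nabla U(\mathbf{r})$, one obtains

$$\mu\,\ddot{\mathbf{r}} \cdot \dot{\mathbf{r}} + \dot{\mathbf{r}} \cdot \nabla U(\mathbf{r}) = \frac{d}{dt}\left(\frac{1}{2}\mu\dot{\mathbf{r}}^2 + U(\mathbf{r})\right) = 0 \,,$$

hence

$$\frac{dE}{dt} = 0 \,, \qquad \text{where} \quad E = \frac{1}{2}\mu\dot{\mathbf{r}}^2 + U(\mathbf{r}) \,.$$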



Thus, even though in general $E(\mathbf{r}, \dot{\mathbf{r}})$ is a function of the position $\mathbf{r}$ and the velocity
$\dot{\mathbf{r}}$, it is constant when evaluated along any solution of the equation of motion. Later
on we shall call this kind of time derivative, taken along a solution, the orbital
derivative.
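The conservation of $E$ and of $l_z$ along solutions can be illustrated numerically. The sketch below integrates the planar relative motion $\mu\ddot{\mathbf{r}} = -A\,\mathbf{r}/r^3$ with a classical fourth-order Runge-Kutta step. The integrator, the step size, and the sample values of $\mu$, $A$ and the initial state are assumptions chosen for illustration; they are not taken from the text.

```python
import math


def kepler_rhs(state, mu, A):
    """Right-hand side of mu * r'' = -A * r / |r|^3 for state (x, y, vx, vy)."""
    x, y, vx, vy = state
    r3 = (x * x + y * y) ** 1.5
    return (vx, vy, -A * x / (mu * r3), -A * y / (mu * r3))


def rk4_step(state, dt, mu, A):
    """One classical Runge-Kutta step of size dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = kepler_rhs(state, mu, A)
    k2 = kepler_rhs(shifted(state, k1, dt / 2), mu, A)
    k3 = kepler_rhs(shifted(state, k2, dt / 2), mu, A)
    k4 = kepler_rhs(shifted(state, k3, dt), mu, A)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def energy(state, mu, A):
    """E = mu v^2 / 2 + U(r) with U(r) = -A / r."""
    x, y, vx, vy = state
    return 0.5 * mu * (vx * vx + vy * vy) - A / math.hypot(x, y)


def angular_momentum(state, mu):
    """z-component of l = mu r x v for planar motion."""
    x, y, vx, vy = state
    return mu * (x * vy - y * vx)


def integrate(state, dt, steps, mu, A):
    """Advance the state by `steps` Runge-Kutta steps."""
    for _ in range(steps):
        state = rk4_step(state, dt, mu, A)
    return state


start = (1.0, 0.0, 0.0, 0.8)                      # illustrative initial data
end = integrate(start, 1.0e-3, 5000, mu=1.0, A=1.0)
print(energy(start, 1.0, 1.0), energy(end, 1.0, 1.0))
print(angular_momentum(start, 1.0), angular_momentum(end, 1.0))
```

Up to the small discretization error of the integrator, both printed pairs should agree, in line with the vanishing orbital derivative of $E$ and the conservation of $\mathbf{l}$.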

For the problem that we are studying in this section this result implies that

$$E = \frac{1}{2}\mu v^2 + U(r) = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\phi}^2\right) + U(r) = \text{const} \,. \tag{1.20}$$

We can extract $\dot{r}$ as a function of $r$ from (1.20) and (1.19a):

$$\dot{r} = \sqrt{\frac{2\,(E - U(r))}{\mu} - \frac{l^2}{\mu^2 r^2}} \,. \tag{1.19b}$$

Eqs. (1.19a) and (1.19b) form a system of two coupled ordinary differential equations
of first order. They were obtained from two conservation laws, the conservation
of the modulus of the angular momentum and the conservation of the total energy
of the relative motion. Although this coupled system is soluble, see Sect. 1.29
and Practical Example 6 below, the procedure is somewhat cumbersome. It is simpler
to work out a parametric form of the solutions by obtaining the radial variable
as a function of the azimuth, $r = r(\phi)$ (thereby losing information on the evolution
of $r(t)$ as a function of time, though).

By “dividing” (1.19b) by (1.19a) and making use of $dr/d\phi = (dr/dt)/(d\phi/dt)$,
we find that

$$\frac{1}{r^2}\frac{dr}{d\phi} = \sqrt{\frac{2\mu\,(E - U(r))}{l^2} - \frac{1}{r^2}} \,.$$

This differential equation is of a type that can always be integrated. This means that
its solution is reducible to ordinary integrations. It belongs to the class of ordinary
differential equations with separable variables, cf. Sect. 1.22 below, for which
general methods of solution exist. In the present example, where $U(r) = -A/r$,
there is a trick that allows one to obtain solutions directly, without doing any integrals.
It goes as follows.

Setting $U(r) = -A/r$ and replacing $r(\phi)$ by the function $\sigma(\phi) = 1/r(\phi)$, we
obtain the differential equation

$$-\frac{d\sigma}{d\phi} = \sqrt{\frac{2\mu\,(E + A\sigma)}{l^2} - \sigma^2} \,,$$

where we have made use of $d\sigma/d\phi = -r^{-2}\,dr/d\phi$.

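It is convenient to define the following constants:

$$p \overset{\text{def}}{=} \frac{l^2}{A\mu} \,, \qquad \varepsilon \overset{\text{def}}{=} \sqrt{1 + \frac{2El^2}{\mu A^2}} \,.$$

The parameter $p$ has the dimension of length, while $\varepsilon$ is dimensionless. There
follows

$$\left(\frac{d\sigma}{d\phi}\right)^2 + \left(\sigma - \frac{1}{p}\right)^2 = \frac{\varepsilon^2}{p^2} \,,$$

an equation that is solved by substituting $\sigma - 1/p = (\varepsilon/p)\cos\phi$. Rewritten in
terms of the original variable $r(\phi)$ the general solution of the Kepler problem is

$$r(\phi) = \frac{p}{1 + \varepsilon\cos\phi} \,. \tag{1.21}$$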
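The relation between the constants of the motion $(E, l)$ and the orbit parameters $(p, \varepsilon)$ can be made concrete with a small sketch. It computes $p$ and $\varepsilon$ from the definitions above and checks, by a central finite difference, that $\sigma(\phi) = 1/r(\phi)$ built from (1.21) satisfies $(d\sigma/d\phi)^2 + (\sigma - 1/p)^2 = \varepsilon^2/p^2$. The function names, the finite-difference step and the sample values are assumptions for illustration.

```python
import math


def orbit_parameters(E, l, mu, A):
    """Return (p, eps) with p = l^2 / (A mu), eps = sqrt(1 + 2 E l^2 / (mu A^2))."""
    p = l**2 / (A * mu)
    eps = math.sqrt(1.0 + 2.0 * E * l**2 / (mu * A**2))
    return p, eps


def r_of_phi(phi, p, eps):
    """Kepler orbit (1.21): r(phi) = p / (1 + eps cos(phi))."""
    return p / (1.0 + eps * math.cos(phi))


def sigma_equation_residual(phi, p, eps, h=1.0e-5):
    """Residual of (dsigma/dphi)^2 + (sigma - 1/p)^2 - (eps/p)^2 at angle phi.

    The derivative is approximated by a central difference with step h.
    """
    def sigma(x):
        return 1.0 / r_of_phi(x, p, eps)

    dsigma = (sigma(phi + h) - sigma(phi - h)) / (2.0 * h)
    return dsigma**2 + (sigma(phi) - 1.0 / p) ** 2 - (eps / p) ** 2


p, eps = orbit_parameters(E=-0.68, l=0.8, mu=1.0, A=1.0)  # illustrative values
print(p, eps)
print(sigma_equation_residual(0.7, p, eps))  # should be close to zero
```

For the sample values, $E < 0$ gives $\varepsilon < 1$, i.e. a bound orbit.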



Before proceeding to analyze these solutions we remark that (1.19a) is a consequence
of the conservation of angular momentum and is therefore valid for any
central force. The quantity $r^2\dot{\phi}/2$ is the surface velocity at which the radius vector
moves over the plane of motion. Indeed, if $\mathbf{r}$ changes by the amount $d\mathbf{r}$, the radius
sweeps out the area $dF = |\mathbf{r} \times d\mathbf{r}|/2$. Thus, per unit of time,

$$\frac{dF}{dt} = \frac{1}{2}\,|\mathbf{r} \times \dot{\mathbf{r}}| = \frac{l}{2\mu} = \text{const} \,. \tag{1.22}$$

This is the content of Kepler’s second law (1609):



The radius vector from the sun to the planets sweeps out equal areas in 
equal times. 



We note under which conditions this statement holds true: it applies to any 
central force but only in the two-body problem; for the motion of a planet it is 
valid to the extent the interaction with the other planets is negligible compared to 
the action of the sun. 

In studying the explicit form of the solutions (1.21) it is useful to introduce
Cartesian coordinates $(x, y)$ in the plane of the orbit. Equation (1.21) is then turned
into a quadratic form in $x$ and $y$, and the nature of the Kepler orbits is made more
evident: they are conics. One sets

$$x = r\cos\phi + c \,, \qquad y = r\sin\phi \,,$$

and chooses the constant $c$ so that in the equation

$$r^2 = (x - c)^2 + y^2 = [\,p - \varepsilon r\cos\phi\,]^2 = [\,p - \varepsilon(x - c)\,]^2$$

the terms linear in $x$ cancel. As long as $\varepsilon \neq 1$, this is achieved by the choice

$$c = \frac{\varepsilon p}{1 - \varepsilon^2} \,.$$

Finally, with the definition

$$a \overset{\text{def}}{=} \frac{p}{1 - \varepsilon^2} \,,$$

(1.21) becomes

$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1 \,, \tag{1.21'}$$

i.e. an equation of second order containing only the squares of $x$ and $y$. Here two
distinct cases are possible.

(i) $\varepsilon > 1$, i.e. $c^2 > a^2$. In this case (1.21') describes a hyperbola. The center
of the force field lies at one of the foci. For the attractive case ($A$ and $p$ are
positive) the branch of the hyperbola that opens toward the force center is the
physical one. This applies to the case of gravitational interaction (cf. Fig. 1.4).

Fig. 1.4. If in the Kepler problem the energy $E$ (1.20) is positive, the orbits of relative
motion are branches of hyperbolas. The figure shows the relevant branch in the case
of an attractive force



These concave branches describe the orbits of meteorites whose total energy is 
positive. Physically speaking, this means that they have enough kinetic energy to 
escape from the attractive gravitational field to infinity. 

The branch turning away from the force center is the relevant one when the 
force is repulsive, i.e. if A and p are negative. This situation occurs in the scattering 
of two electric point charges with equal signs. 

(ii) $\varepsilon < 1$, i.e. $c < a$. In this case the energy $E$ is negative. This implies that
the particle cannot escape from the force field: its orbits must be finite everywhere.
Indeed, (1.21') now describes an ellipse (cf. Fig. 1.5) with

$$\text{semimajor axis} \quad a = \frac{p}{1 - \varepsilon^2} = \frac{A}{2(-E)} \,,$$

$$\text{semiminor axis} \quad b = \sqrt{a^2 - c^2} = \sqrt{pa} = \frac{l}{\sqrt{2\mu(-E)}} \,.$$

Fig. 1.5. If the energy (1.20) is negative, the orbit is an
ellipse. The system is bound and cannot escape to infinity

The orbit is a finite orbit. It is closed and therefore periodic. This is Kepler’s first
law: the planets move on ellipses with the sun at one focus. This law holds true
only for the gravitational interaction of two bodies. All finite orbits are closed and
are ellipses (or circles). In Sect. 1.24 we return to this question and illustrate it
with a few examples for interactions close to, but different from, the gravitational
case. The area of the orbital ellipse is $F = \pi a b = \pi a\sqrt{ap}$. If $T$ denotes the
period of revolution (the time of one complete circuit of the planet in its orbit),
the area law (1.22) says that $F = Tl/(2\mu)$. A consequence of this is Kepler’s third
law (1619), which relates the third power of the semimajor axis to the square of
the period, viz.



$$\frac{a^3}{T^2} = \frac{A}{(2\pi)^2\mu} = \frac{G(m_1 + m_2)}{(2\pi)^2} = \text{const} \,. \tag{1.23}$$



If one neglects the mutual interactions of the planets compared to their interaction 
with the sun and if their masses are small in comparison with the solar mass, we 
obtain: 



For all planets of a given planetary system the ratio of the cubes of the 
semimajor axes to the squares of the periods is the same. 
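Relation (1.23) lends itself to a short numerical illustration. The sketch below solves it for the period $T$ and evaluates it for an Earth-like orbit. The value of $G$ is the one quoted in Sect. 1.5; the solar mass, the Earth's mass and the semimajor axis are approximate values supplied here as assumptions and do not come from the text.

```python
import math

G = 6.67259e-11        # m^3 kg^-1 s^-2, value quoted in Sect. 1.5

# Approximate values, assumed for illustration only (not from the text):
M_SUN = 1.989e30       # kg
M_EARTH = 5.972e24     # kg
A_EARTH = 1.496e11     # m, semimajor axis of the Earth's orbit


def orbital_period(a, m1, m2, g=G):
    """Period T from (1.23): a^3 / T^2 = G (m1 + m2) / (2 pi)^2."""
    return 2.0 * math.pi * math.sqrt(a**3 / (g * (m1 + m2)))


def kepler_ratio(a, m1, m2, g=G):
    """Ratio a^3 / T^2, which by (1.23) does not depend on a."""
    return a**3 / orbital_period(a, m1, m2, g) ** 2


period_days = orbital_period(A_EARTH, M_SUN, M_EARTH) / 86400.0
print(period_days)                       # close to one year
print(kepler_ratio(1.0e11, M_SUN, 0.0))  # same ratio for different axes
print(kepler_ratio(5.0e11, M_SUN, 0.0))
```

Setting the planetary mass to zero in `kepler_ratio` corresponds to the approximation stated above, in which the planet's mass is neglected compared with the solar mass.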



Of course, the special case of circular orbits is contained in (1.21'). It occurs
when $\varepsilon = 0$, i.e. when $E = -\mu A^2/(2l^2)$, in which case the radius of the orbit has
the constant value $a = l^2/(\mu A)$.

The case $\varepsilon = 1$ is a special case which we have so far excluded. Like the circular
orbit it is a singular case. The energy is exactly zero, $E = 0$. This means that
the particle escapes to infinity but reaches infinity with vanishing kinetic energy.
The orbit is given by

$$y^2 + 2px - 2pc - p^2 = 0 \,,$$

where $c$ may be chosen at will, e.g. $c = 0$. The orbit is a parabola.

So far we have studied the relative motion of two celestial bodies. It remains 
to transcribe this motion back to the true coordinates by means of (1.15). As an 
example we show this for the finite orbits (ii). Choosing the center of mass as the 
origin, one has 



$$\mathbf{s}_1 = \frac{m_2}{m_1 + m_2}\,\mathbf{r} \,, \qquad \mathbf{s}_2 = -\frac{m_1}{m_1 + m_2}\,\mathbf{r} \,.$$

The celestial bodies 1 and 2 move along ellipses that are geometrically similar to
the one along which the relative coordinate moves. They are reduced by the scale
factors $m_2/(m_1 + m_2)$ and $m_1/(m_1 + m_2)$, respectively. The center of mass $S$ is
a common focus of these ellipses:

$$s_1(\phi) = \frac{m_2}{m_1 + m_2}\,\frac{p}{1 + \varepsilon\cos\phi} \,, \qquad s_2(\phi) = \frac{m_1}{m_1 + m_2}\,\frac{p}{1 + \varepsilon\cos\phi} \qquad (s_i = |\mathbf{s}_i|) \,.$$



Figure 1.6a shows the case of equal masses; Fig. 1.6b shows the case $m_1 \ll m_2$.

Fig. 1.6. Upon transformation of the relative motion of Fig. 1.5 to the real motion of
the two celestial bodies, both move on ellipses about the center of mass $S$, which is
at one of the foci. (a) shows the situation for equal masses $m_1 = m_2$; (b) shows the case
$m_2 \gg m_1$. Cf. Practical Example 1.1



1.7.3 Center-of-Mass and Relative Momentum in the Two-Body System 



As we have seen, the equations of motion can be separated in center-of-mass and 
relative coordinates. Similarly, the sum of the momenta and the sum of the angular 
momenta can be split into parts pertaining to the center-of-mass motion and parts 
pertaining to the relative motion. In particular, the total kinetic energy is equal 
to the sum of the kinetic energies contained in the center-of-mass and relative 
motions, respectively. These facts are important in formulating conservation laws. 

Let P be the momentum of the center of mass and p the momentum of the 
relative motion. We then have, in more detail, 

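$$\mathbf{P} \overset{\text{def}}{=} (m_1 + m_2)\,\dot{\mathbf{r}}_S = m_1\dot{\mathbf{r}}_1 + m_2\dot{\mathbf{r}}_2 = \mathbf{p}_1 + \mathbf{p}_2 \,,$$

$$\mathbf{p} \overset{\text{def}}{=} \mu\dot{\mathbf{r}} = \frac{1}{m_1 + m_2}\,(m_2\mathbf{p}_1 - m_1\mathbf{p}_2) \,,$$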

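or, by inverting these equations,

$$\mathbf{p}_1 = \mathbf{p} + \frac{m_1}{m_1 + m_2}\,\mathbf{P} \,; \qquad \mathbf{p}_2 = -\mathbf{p} + \frac{m_2}{m_1 + m_2}\,\mathbf{P} \,.$$

The total kinetic energy is

$$T_1 + T_2 = \frac{\mathbf{p}_1^2}{2m_1} + \frac{\mathbf{p}_2^2}{2m_2} = \frac{\mathbf{p}^2}{2\mu} + \frac{\mathbf{P}^2}{2(m_1 + m_2)} \,. \tag{1.24}$$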



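Thus, the kinetic energy can be written as the sum of the kinetic energy of relative
motion, $\mathbf{p}^2/2\mu$, and the kinetic energy of the center of mass, $\mathbf{P}^2/2(m_1 + m_2)$. We
note that there are no mixed terms in $\mathbf{p}$ and $\mathbf{P}$.

In a similar way we analyze the sum $\mathbf{L}$ of the angular momenta $\mathbf{l}_1 = m_1\mathbf{r}_1 \times \dot{\mathbf{r}}_1$
and $\mathbf{l}_2 = m_2\mathbf{r}_2 \times \dot{\mathbf{r}}_2$. One finds that

$$\begin{aligned}
\mathbf{L} = \mathbf{l}_1 + \mathbf{l}_2 &= (m_1 + m_2)\,\mathbf{r}_S \times \dot{\mathbf{r}}_S + \left[m_1\frac{m_2^2}{(m_1 + m_2)^2} + m_2\frac{m_1^2}{(m_1 + m_2)^2}\right]\mathbf{r} \times \dot{\mathbf{r}} \\
&= (m_1 + m_2)\,\mathbf{r}_S \times \dot{\mathbf{r}}_S + \mu\,\mathbf{r} \times \dot{\mathbf{r}} \equiv \mathbf{l}_S + \mathbf{l}_{\text{rel}} \,.
\end{aligned} \tag{1.25}$$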



The total angular momentum splits into the sum of the angular momentum $\mathbf{l}_S$ relative
to the origin $O$ (which can be chosen arbitrarily) and the angular momentum
of the relative motion $\mathbf{l}_{\text{rel}}$. The first of these, $\mathbf{l}_S$, depends on the choice of the reference
system; the second, $\mathbf{l}_{\text{rel}}$, does not. Therefore, the relative angular momentum
is the relevant dynamical quantity.
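The decompositions (1.24) and (1.25) can be verified for arbitrary sample data. The following sketch computes the kinetic energy and the total angular momentum once directly from the two particles and once from the center-of-mass and relative quantities. All function names and the numerical values are illustrative assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(c, a):
    return tuple(c * x for x in a)


def split_two_body(m1, r1, v1, m2, r2, v2):
    """Return (T_direct, T_split, L_direct, L_split) for a two-body state."""
    total = m1 + m2
    mu = m1 * m2 / total
    r_s = scale(1.0 / total, add(scale(m1, r1), scale(m2, r2)))
    big_p = add(scale(m1, v1), scale(m2, v2))      # P = p1 + p2
    r = sub(r1, r2)
    p = scale(mu, sub(v1, v2))                     # p = mu * dr/dt
    t_direct = 0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)
    t_split = dot(p, p) / (2.0 * mu) + dot(big_p, big_p) / (2.0 * total)
    l_direct = add(scale(m1, cross(r1, v1)), scale(m2, cross(r2, v2)))
    l_split = add(cross(r_s, big_p), cross(r, p))  # l_S + l_rel
    return t_direct, t_split, l_direct, l_split


result = split_two_body(2.0, (1.0, 0.0, 0.5), (0.0, 1.0, 0.2),
                        3.0, (-1.0, 2.0, 0.0), (0.3, -0.5, 1.0))
print(result)
```

The two energies and the two angular-momentum vectors should coincide up to rounding, reflecting the absence of mixed terms in $\mathbf{p}$ and $\mathbf{P}$.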



1.8 Systems of Finitely Many Particles 



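We consider $n$ mass points $(m_1, m_2, \ldots, m_n)$, subject to the internal forces $\mathbf{F}_{ik}$
(acting between $i$ and $k$) and to the external forces $\mathbf{K}_i$. We assume that the internal
forces are central forces, i.e. that they have the form

$$\mathbf{F}_{ik} = F_{ik}(r_{ik})\,\frac{\mathbf{r}_k - \mathbf{r}_i}{r_{ik}} \qquad (r_{ik} = |\mathbf{r}_i - \mathbf{r}_k|) \,, \tag{1.26}$$

where $F_{ik}(r) = F_{ki}(r)$ is a scalar and continuous function of the distance $r$. (In
Sect. 1.15 we shall deal with a somewhat more general case.) Central forces possess
potentials

$$U_{ik}(r) = -\int_{r_0}^{r} F_{ik}(r')\,dr' \,, \tag{1.27}$$

and we have $\mathbf{F}_{ik} = -\nabla_k U_{ik}(r)$, where

$$r = \sqrt{\left(x^{(i)} - x^{(k)}\right)^2 + \left(y^{(i)} - y^{(k)}\right)^2 + \left(z^{(i)} - z^{(k)}\right)^2}$$

and the gradient is given by

$$\nabla_k = \left(\frac{\partial}{\partial x^{(k)}} \,,\ \frac{\partial}{\partial y^{(k)}} \,,\ \frac{\partial}{\partial z^{(k)}}\right) .$$

(Remember that $\mathbf{F}_{ik}$ is the force that $i$ exerts on $k$.) The equations of motion read

$$\begin{aligned}
m_1\ddot{\mathbf{r}}_1 &= \mathbf{F}_{21} + \mathbf{F}_{31} + \ldots + \mathbf{F}_{n1} + \mathbf{K}_1 \,, \\
m_2\ddot{\mathbf{r}}_2 &= \mathbf{F}_{12} + \mathbf{F}_{32} + \ldots + \mathbf{F}_{n2} + \mathbf{K}_2 \,, \\
&\ \,\vdots \\
m_n\ddot{\mathbf{r}}_n &= \mathbf{F}_{1n} + \mathbf{F}_{2n} + \ldots + \mathbf{F}_{n-1\,n} + \mathbf{K}_n \,, \quad \text{or} \\
m_i\ddot{\mathbf{r}}_i &= \sum_{k \neq i}^{n} \mathbf{F}_{ki} + \mathbf{K}_i \,, \quad \text{with} \quad \mathbf{F}_{ki} = -\mathbf{F}_{ik} \,.
\end{aligned} \tag{1.28}$$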
With these assumptions one proves the following assertions. 




1.9 The Principle of Center-of-Mass Motion



The center of mass $S$ of the $n$-particle system behaves like a single particle
of mass $M = \sum_{i=1}^{n} m_i$ acted upon by the resultant of the external forces:

$$M\ddot{\mathbf{r}}_S = \sum_{i=1}^{n} \mathbf{K}_i \,, \qquad \text{where} \quad \mathbf{r}_S \overset{\text{def}}{=} \frac{1}{M}\sum_{i=1}^{n} m_i\mathbf{r}_i \,. \tag{1.29}$$

This principle is proved by summing the equations (1.28) over all particles.
The internal forces cancel in pairs because $\mathbf{F}_{ki} = -\mathbf{F}_{ik}$, from Newton’s third
law.



1.10 The Principle of Angular-Momentum Conservation 



The time derivative of the total angular momentum equals the sum of all
external torques:

$$\frac{d}{dt}\left(\sum_{i=1}^{n} \mathbf{l}_i\right) = \sum_{j=1}^{n} \mathbf{r}_j \times \mathbf{K}_j \,. \tag{1.30}$$



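Proof. For a fixed particle index $i$

$$m_i\,\mathbf{r}_i \times \ddot{\mathbf{r}}_i = \sum_{k \neq i} F_{ki}(r_{ik})\,\frac{1}{r_{ik}}\,\mathbf{r}_i \times (\mathbf{r}_i - \mathbf{r}_k) + \mathbf{r}_i \times \mathbf{K}_i \,.$$

The left-hand side is equal to

$$m_i\frac{d}{dt}\left(\mathbf{r}_i \times \dot{\mathbf{r}}_i\right) = \frac{d}{dt}\,\mathbf{l}_i \,.$$

Taking the sum over all $i$ yields the result (1.30). The internal forces cancel pairwise
because the cross product is antisymmetric while $F_{ik}$ is symmetric. □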
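The pairwise cancellation used in the proofs of Sects. 1.9 and 1.10 can be made explicit with a small sketch. It builds the internal forces of an $n$-particle system from a scalar pair function $F_{ik}(r) = F_{ki}(r)$, following (1.26), and sums the forces and the torques. The gravitation-like pair function with $G$ set to 1, the masses and the positions are assumptions for illustration only.

```python
import math


def internal_forces(positions, pair_strength):
    """Net internal force on each particle for central pair forces (1.26).

    pair_strength(i, k, r) returns the scalar F_ik(r) = F_ki(r).
    """
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for k in range(i + 1, n):
            d = [positions[k][c] - positions[i][c] for c in range(3)]  # r_k - r_i
            r = math.sqrt(sum(x * x for x in d))
            f = pair_strength(i, k, r)
            for c in range(3):
                forces[k][c] += f * d[c] / r   # F_ik, the force that i exerts on k
                forces[i][c] -= f * d[c] / r   # F_ki = -F_ik, felt by i
    return forces


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def total_force_and_torque(positions, forces):
    """Sum of all internal forces and of all internal torques r_i x F_i."""
    force = [sum(f[c] for f in forces) for c in range(3)]
    torque = [0.0, 0.0, 0.0]
    for r, f in zip(positions, forces):
        t = cross(r, f)
        torque = [torque[c] + t[c] for c in range(3)]
    return force, torque


masses = [1.0, 2.0, 3.0, 4.0]                       # illustrative values
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]


def gravity_like(i, k, r):
    """Attractive pair force: negative scalar, so F_ik points from k toward i."""
    return -masses[i] * masses[k] / r**2


print(total_force_and_torque(points, internal_forces(points, gravity_like)))
```

Both sums vanish up to rounding, so only the external forces $\mathbf{K}_i$ can change the total momentum and the total angular momentum.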
1.11 The Principle of Energy Conservation



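The time derivative of the total internal energy is equal to the total power
(work per unit time) of the external forces, viz.

$$\frac{d}{dt}(T + U) = \sum_{i=1}^{n} \left(\dot{\mathbf{r}}_i \cdot \mathbf{K}_i\right) , \qquad \text{where}$$

$$T = \frac{1}{2}\sum_{i=1}^{n} m_i\dot{\mathbf{r}}_i^2 \equiv \sum_{i=1}^{n} T_i \qquad \text{and} \qquad U = \sum_{i=1}^{n}\sum_{k=i+1}^{n} U_{ik}(r_{ik}) \equiv U(\mathbf{r}_1, \ldots, \mathbf{r}_n) \,. \tag{1.31}$$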



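Proof. For fixed $i$ one has

$$m_i\ddot{\mathbf{r}}_i = -\nabla_i \sum_{k \neq i} U_{ik}(r_{ik}) + \mathbf{K}_i \,.$$

Taking the scalar product of this equation with $\dot{\mathbf{r}}_i$ yields

$$m_i\,\ddot{\mathbf{r}}_i \cdot \dot{\mathbf{r}}_i = \frac{1}{2}\frac{d}{dt}\left(m_i\dot{\mathbf{r}}_i^2\right) = -\dot{\mathbf{r}}_i \cdot \nabla_i \sum_{k \neq i} U_{ik}(r_{ik}) + \dot{\mathbf{r}}_i \cdot \mathbf{K}_i \,.$$

Now we take the sum over all particles

$$\frac{d}{dt}\left(\frac{1}{2}\sum_{i=1}^{n} m_i\dot{\mathbf{r}}_i^2\right) = -\sum_{i=1}^{n}\sum_{\substack{k=1 \\ k \neq i}}^{n} \dot{\mathbf{r}}_i \cdot \nabla_i U_{ik}(r_{ik}) + \sum_{i=1}^{n} \dot{\mathbf{r}}_i \cdot \mathbf{K}_i$$

and isolate the terms $i = a$, $k = b$ and $i = b$, $k = a$, with $b > a$, of the double
sum on the right-hand side. Their sum is

$$\dot{\mathbf{r}}_a \cdot \nabla_a U_{ab} + \dot{\mathbf{r}}_b \cdot \nabla_b U_{ba} = \left[\dot{\mathbf{r}}_a \cdot \nabla_a + \dot{\mathbf{r}}_b \cdot \nabla_b\right] U_{ab} = \frac{d}{dt}U_{ab} \,,$$

because $U_{ab} = U_{ba}$. From this it follows that

$$\frac{d}{dt}\left[\frac{1}{2}\sum_{i=1}^{n} m_i\dot{\mathbf{r}}_i^2 + \sum_{i=1}^{n}\sum_{k=i+1}^{n} U_{ik}(r_{ik})\right] = \sum_{j=1}^{n} \dot{\mathbf{r}}_j \cdot \mathbf{K}_j \,. \qquad \square$$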
We consider next an important special case: the closed n-particle system. 





1.12 The Closed n-Particle System

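A system is said to be closed if all external forces vanish. Proposition 1.9 reduces
now to

$$M\ddot{\mathbf{r}}_S = 0 \,, \qquad \text{or} \qquad M\dot{\mathbf{r}}_S = \sum_{i=1}^{n} m_i\dot{\mathbf{r}}_i \overset{\text{def}}{=} \mathbf{P} = \text{const} \,, \qquad \text{and}$$

$$\mathbf{r}_S(t) = \frac{1}{M}\,\mathbf{P}\,t + \mathbf{r}_S(0) \qquad \text{with} \quad \mathbf{P} = \sum_{i=1}^{n} \mathbf{p}_i = \text{const} \,.$$

This is the principle of conservation of momentum: the total momentum of a closed
system is conserved.

Proposition 1.10 reads

$$\sum_{i=1}^{n} \mathbf{r}_i \times \mathbf{p}_i \equiv \sum_{i=1}^{n} \mathbf{l}_i = \mathbf{L} = \text{const} \,.$$

The total angular momentum is also an integral of the motion.

Proposition 1.11 finally becomes

$$T + U = \sum_{i=1}^{n} \frac{\mathbf{p}_i^2}{2m_i} + \sum_{k > i} U_{ik}(r_{ik}) \equiv E = \text{const} \,.$$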

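In summary, the closed $n$-particle system is characterized by 10 integrals or constants
of the motion, viz.

Momentum Conservation:             $\mathbf{P} = \text{const}$
Center-of-Mass Principle:          $\mathbf{r}_S(t) - \frac{\mathbf{P}}{M}\,t = \mathbf{r}_S(0) = \text{const}$
Conservation of Energy:            $E = T + U = \frac{\mathbf{P}^2}{2M} + T_{\text{rel}} + U = \text{const}$
Conservation of Angular Momentum:  $\mathbf{L} = \sum_{i=1}^{n} \mathbf{l}_i = \mathbf{r}_S \times \mathbf{P} + \mathbf{l}_{\text{rel}} = \text{const}$

The quantities $\{\mathbf{r}_S(0), \mathbf{P}, \mathbf{L}, E\}$ form the ten classical constants of the motion of
the closed $n$-particle system.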

This remarkable result calls for questions and some comments: 

(i) Perhaps the most obvious question is whether the existence of ten integrals of
the motion guarantees integrability of the equations of motion, and if so, for which
number $n$ of particles it does so. The answer may seem surprising at this point:
a closed two-particle system with central forces is indeed integrable, the general
closed three-particle system is not. In other terms, while the constants of the motion
guarantee integrability for $n = 2$, this is not true for $n \geq 3$. The reason for this
observation is that, in addition to being conserved, the integrals of the motion must
fulfill certain conditions of compatibility. We shall come back to this question in
Sect. 2.37.

(ii) Why are there just ten such integrals? The answer to this question touches
upon a profound relationship between invariance of a physical system under space-time
coordinate transformations and conservation laws. It turns out that the most
general affine transformation that relates one inertial system to another depends on
the same number, ten, of real parameters. This is worked out in the next section.
Here and in Chap. 2 it will become clear that there is a one-to-one correspondence
between these parameters and the ten integrals of the motion.



1.13 Galilei Transformations 



It is not difficult to verify that the most general affine transformation g that maps 
inertial frames onto inertial frames must have the following form: 



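$$\begin{aligned}
\mathbf{r} &\underset{g}{\mapsto} \mathbf{r}' = \mathbf{R}\mathbf{r} + \mathbf{w}t + \mathbf{a} \qquad \text{with} \quad \mathbf{R} \in O(3) \,, \ \det\mathbf{R} = +1 \ \text{or} \ -1 \,, \\
t &\underset{g}{\mapsto} t' = \lambda t + s \qquad \text{with} \quad \lambda = +1 \ \text{or} \ -1 \,.
\end{aligned} \tag{1.32}$$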



Here R is a rotation, w a constant velocity vector, a a constant vector of dimension 
length. We analyze this transformation by splitting it into several steps, as follows. 

1. A shift of the origin by the constant vector a: 



r' = r + a . 



2. Uniform motion of $K'$ relative to $K$, with constant velocity, such that $K$ and
$K'$ coincide at time $t = 0$:

$$\mathbf{r}' = \mathbf{r} + \mathbf{w}t \,.$$

3. A rotation whereby the system $K'$ is rotated away from $K$ in such a way
that their origins are the same, as shown in Fig. 1.7, $\mathbf{r}' = \mathbf{R}\mathbf{r}$. Let

$$\mathbf{r} = (x \equiv r_1,\ y \equiv r_2,\ z \equiv r_3) \,, \qquad \mathbf{r}' = (x' \equiv r'_1,\ y' \equiv r'_2,\ z' \equiv r'_3) \,.$$




Fig. 1.7. Two Cartesian coordinate systems that are connected
by a rotation about the direction $\hat{\mathbf{n}}$ by an angle $\varphi$





When written in components, $\mathbf{r}' = \mathbf{R}\mathbf{r}$ is equivalent to

$$r'_i = \sum_{k=1}^{3} R_{ik}\,r_k \,, \qquad i = 1, 2, 3 \,.$$

We must have $\mathbf{r}'^2 = \mathbf{r}^2$ (this is the defining condition for the rotation group), i.e.

$$\sum_{i=1}^{3} r'_i r'_i = \sum_{i=1}^{3}\sum_{k=1}^{3}\sum_{l=1}^{3} R_{ik}R_{il}\,r_k r_l \overset{!}{=} \sum_{k=1}^{3} r_k r_k \,, \qquad \text{and thus}$$

$$\sum_{i=1}^{3} R_{ik}R_{il} = \delta_{kl} \,, \qquad \text{or} \qquad \sum_{i=1}^{3} (\mathbf{R}^T)_{ki}R_{il} = \delta_{kl} \,. \tag{1.33}$$



$\mathbf{R}$ is a real orthogonal $3 \times 3$ matrix. Equation (1.33) implies $(\det\mathbf{R})^2 = 1$, i.e.
$\det\mathbf{R} = +1$ or $-1$. Equation (1.33) yields 6 conditions for the 9 matrix elements
of $\mathbf{R}$. Therefore $\mathbf{R}$ depends on 3 free parameters, for example a direction $\hat{\mathbf{n}}$ about
which $K'$ is rotated with respect to $K$ and which is given by its polar angles $(\theta, \phi)$,
and the angle $\varphi$ by which $K$ must be rotated in order to reach $K'$ (see Fig. 1.7).
4. A shift of the time origin by the fixed amount $s$:

$$t' = t + s \,.$$
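The parametrization of $\mathbf{R}$ in step 3 by a direction $\hat{\mathbf{n}}$ and an angle $\varphi$ can be sketched in code. The example below builds a rotation matrix with Rodrigues' formula, which is a standard construction but is not the one spelled out in the text, and checks the orthogonality condition (1.33) and $\det\mathbf{R} = +1$. The sign convention for the angle (active, counterclockwise rotation about $\hat{\mathbf{n}}$) and all function names are assumptions.

```python
import math


def axis_from_polar(theta, phi):
    """Unit vector with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def rotation_matrix(n, angle):
    """Rotation by `angle` about the direction n (Rodrigues' formula)."""
    norm = math.sqrt(sum(c * c for c in n))
    nx, ny, nz = (c / norm for c in n)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + t * nx * nx,      t * nx * ny - s * nz, t * nx * nz + s * ny],
        [t * ny * nx + s * nz, c + t * ny * ny,      t * ny * nz - s * nx],
        [t * nz * nx - s * ny, t * nz * ny + s * nx, c + t * nz * nz],
    ]


def orthogonality_defect(R):
    """Largest deviation of sum_i R[i][k] R[i][l] from delta_kl, cf. (1.33)."""
    worst = 0.0
    for k in range(3):
        for l in range(3):
            value = sum(R[i][k] * R[i][l] for i in range(3))
            worst = max(worst, abs(value - (1.0 if k == l else 0.0)))
    return worst


def det3(R):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))


R = rotation_matrix(axis_from_polar(0.4, 1.1), 0.9)   # illustrative angles
print(orthogonality_defect(R), det3(R))
```

The three numbers `theta`, `phi` and `angle` are the 3 free parameters left over after the 6 conditions (1.33) have been imposed on the 9 matrix elements.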

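Collecting all steps we see that the general transformation

$$\begin{aligned}
\mathbf{r}' &= \mathbf{R}\mathbf{r} + \mathbf{w}t + \mathbf{a} \,, \\
t' &= \lambda t + s
\end{aligned} \tag{1.34}$$

with, initially, $\det\mathbf{R} = +1$ and $\lambda = +1$, depends on 10 real parameters, viz.

$$g = g(\underbrace{\varphi, \hat{\mathbf{n}}}_{\mathbf{R}}, \mathbf{w}, \mathbf{a}, s) \,.$$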

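There are as many parameters in the Galilei transformation as there are constants
of the motion in the closed $n$-particle system. The transformations $g$ form a group,
the proper, orthochronous Galilei group $G_+^{\uparrow}$.⁴ In order to show this, we consider
first the composition of two subsequent transformations of this kind. We have

$$\mathbf{r}_1 = \mathbf{R}^{(1)}\mathbf{r}_0 + \mathbf{w}^{(1)}t_0 + \mathbf{a}^{(1)} \,; \qquad t_1 = t_0 + s^{(1)} \,,$$

$$\mathbf{r}_2 = \mathbf{R}^{(2)}\mathbf{r}_1 + \mathbf{w}^{(2)}t_1 + \mathbf{a}^{(2)} \,; \qquad t_2 = t_1 + s^{(2)} \,.$$

Writing the transformation from $\mathbf{r}_0$ to $\mathbf{r}_2$ in the same way,

$$\mathbf{r}_2 = \mathbf{R}^{(3)}\mathbf{r}_0 + \mathbf{w}^{(3)}t_0 + \mathbf{a}^{(3)} \,, \qquad t_2 = t_0 + s^{(3)} \,,$$

we read off the following relations

$$\begin{aligned}
\mathbf{R}^{(3)} &= \mathbf{R}^{(2)}\mathbf{R}^{(1)} \,, \\
\mathbf{w}^{(3)} &= \mathbf{R}^{(2)}\mathbf{w}^{(1)} + \mathbf{w}^{(2)} \,, \\
\mathbf{a}^{(3)} &= \mathbf{R}^{(2)}\mathbf{a}^{(1)} + s^{(1)}\mathbf{w}^{(2)} + \mathbf{a}^{(2)} \,, \\
s^{(3)} &= s^{(2)} + s^{(1)} \,.
\end{aligned} \tag{1.35}$$

⁴ The arrow pointing “upwards” stands for the choice $\lambda = +1$; that is, the time direction remains
unchanged. The plus sign stands for the choice $\det\mathbf{R} = +1$.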

One now shows explicitly that these transformations do form a group by verifying
that they satisfy the group axioms:

1. There is an operation defining the composition of two Galilei transforms:

$$g(\mathbf{R}^{(2)}, \mathbf{w}^{(2)}, \mathbf{a}^{(2)}, s^{(2)})\,g(\mathbf{R}^{(1)}, \mathbf{w}^{(1)}, \mathbf{a}^{(1)}, s^{(1)}) = g(\mathbf{R}^{(3)}, \mathbf{w}^{(3)}, \mathbf{a}^{(3)}, s^{(3)}) .$$

This is precisely what we verified in (1.35).

2. This composition is an associative operation: $g_3(g_2 g_1) = (g_3 g_2)g_1$. This is so
because both addition and matrix multiplication have this property.

3. There exists a unit element, $E = g(\mathbf{1}, 0, 0, 0)$, with the property $g_i E = E g_i = g_i$
for all $g_i \in G_{+}^{\uparrow}$.

4. For every $g \in G_{+}^{\uparrow}$ there is an inverse transformation $g^{-1}$ such that $g \cdot g^{-1} = E$.
This is seen as follows. Let $g = g(\mathbf{R}, \mathbf{w}, \mathbf{a}, s)$. From (1.35) one sees that
$g^{-1} = g(\mathbf{R}^T, -\mathbf{R}^T\mathbf{w}, s\mathbf{R}^T\mathbf{w} - \mathbf{R}^T\mathbf{a}, -s)$.
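
The group axioms can also be checked numerically. The following minimal Python sketch encodes a transformation as a tuple `(R, w, a, s)`, composes two of them according to (1.35), and builds the inverse stated in item 4. The tuple layout, the helper names (`compose`, `inverse`, `act`, `same`) and the sample parameters are our own illustrative assumptions, not notation from the text.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def compose(g2, g1):
    """Return g2 g1 (g1 acts first); parameters combine as in (1.35)."""
    R2, w2, a2, s2 = g2
    R1, w1, a1, s1 = g1
    R2w1, R2a1 = matvec(R2, w1), matvec(R2, a1)
    w3 = [R2w1[i] + w2[i] for i in range(3)]
    a3 = [R2a1[i] + s1 * w2[i] + a2[i] for i in range(3)]
    return (matmul(R2, R1), w3, a3, s2 + s1)

def inverse(g):
    """g^{-1} = g(R^T, -R^T w, s R^T w - R^T a, -s)."""
    R, w, a, s = g
    Rt = transpose(R)
    Rtw, Rta = matvec(Rt, w), matvec(Rt, a)
    return (Rt, [-x for x in Rtw], [s * Rtw[i] - Rta[i] for i in range(3)], -s)

def act(g, r, t):
    """Apply (1.34) with lambda = +1: r' = R r + w t + a, t' = t + s."""
    R, w, a, s = g
    Rr = matvec(R, r)
    return [Rr[i] + w[i] * t + a[i] for i in range(3)], t + s

def flatten(g):
    return [x for row in g[0] for x in row] + list(g[1]) + list(g[2]) + [g[3]]

def same(g, h, tol=1e-12):
    """Compare two transformations parameter by parameter."""
    return all(abs(x - y) < tol for x, y in zip(flatten(g), flatten(h)))

# Unit element and three arbitrary (hypothetical) sample transformations.
E = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0] * 3, [0.0] * 3, 0.0)
g1 = (rot_z(0.3), [1.0, -2.0, 0.5], [0.1, 0.2, 0.3], 1.5)
g2 = (rot_x(-1.1), [0.0, 3.0, 1.0], [-1.0, 0.0, 2.0], -0.7)
g3 = (rot_z(2.0), [0.5, 0.5, -1.0], [4.0, -3.0, 0.0], 0.25)
```

With these definitions, `same(compose(g1, inverse(g1)), E)` tests axiom 4, and applying `act` twice can be compared with a single application of `compose(g2, g1)`.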



It will become clear later on that there is a deeper connection between the 
ten parameters of the proper, orthochronous Galilei group and the constants of 
the motion of the closed n -particle system of Sect. 1.12 and that it is therefore no 
accident that there are exactly ten such integrals. We shall learn that the invariance 
of a mechanical system under 

(i) time translations $t \mapsto t' = t + s$ implies the conservation of total energy $E$
of the system;

(ii) space translations $\mathbf{r} \mapsto \mathbf{r}' = \mathbf{r} + \mathbf{a}$ implies conservation of total momentum
$\mathbf{P}$ of the system (the components of $\mathbf{a}$ correspond to the components of $\mathbf{P}$ in
the sense that if the system is invariant only under translations along a fixed
direction, then only the projection of $\mathbf{P}$ onto that direction is conserved);

(iii) rotations $\mathbf{r} \mapsto \mathbf{r}' = \mathbf{R}(\boldsymbol{\varphi})\mathbf{r}$ about a fixed direction implies the conservation
of the projection of the total angular momentum $\mathbf{L}$ onto that direction.



The assertions (i-iii) are the content of a theorem by Emmy Noether, which 
will be proved and discussed in Sect. 2.19. 

Finally, one easily convinces oneself that in the center-of-mass motion the 
quantity 

$$\mathbf{r}_S(0) = \mathbf{r}_S(t) - \frac{\mathbf{P}}{M}\,t$$

stays invariant under the transformations $\mathbf{r} \mapsto \mathbf{r}' = \mathbf{r} + \mathbf{w}t$.

We conclude this section by considering the choices $\det\mathbf{R} = -1$ and/or $\lambda = -1$
that we have so far excluded. In the Galilei transformation (1.34) the choice $\lambda = -1$
corresponds to a reflection of the time direction, or time reversal. Whether or not
physical phenomena are invariant under this transformation is a question whose
importance goes far beyond mechanics. One easily confirms that all examples considered
until now are indeed invariant. This is so because the equations of motion
contain only the acceleration $\ddot{\mathbf{r}}$, which is invariant by itself, and functions of $\mathbf{r}$:

$$\ddot{\mathbf{r}} + \mathbf{f}(\mathbf{r}) = 0 .$$

By $t \mapsto -t$ the velocity changes sign, $\dot{\mathbf{r}} \mapsto -\dot{\mathbf{r}}$. Therefore, the momentum $\mathbf{p}$ and
also the angular momentum $\mathbf{l}$ change sign. The effect of time reversal is equivalent
to reversal of motion. All physical orbits can be run over in either direction, forward
or backward.

There are examples of physical systems, however, that are not invariant under 
time reversal. These are systems which contain frictional forces proportional to 
the velocity and whose equations of motion have the form 

$$\ddot{\mathbf{r}} + \kappa\dot{\mathbf{r}} + \mathbf{f}(\mathbf{r}) = 0 .$$

With time reversal the damping caused by the second term in this equation would 
be changed to an amplification of the motion, i.e. to a different physical process. 
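
The difference between the two cases can be made concrete with a small numerical experiment. The sketch below is only an illustration under our own assumptions: one dimension, the harmonic choice $f(q) = q$, and a standard fourth-order Runge-Kutta integrator (none of which is prescribed by the text). The system is evolved for a time $T$, the velocity is reversed, and the system is evolved for another $T$. Without friction it should return to its starting point with reversed velocity; with friction it should not.

```python
def rk4_step(q, v, dt, kappa):
    """One RK4 step for  q'' + kappa q' + q = 0  (assumed example f(q) = q)."""
    def acc(q_, v_):
        return -q_ - kappa * v_
    k1q, k1v = v, acc(q, v)
    k2q, k2v = v + 0.5 * dt * k1v, acc(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v)
    k3q, k3v = v + 0.5 * dt * k2v, acc(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v)
    k4q, k4v = v + dt * k3v, acc(q + dt * k3q, v + dt * k3v)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0,
            v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0)

def evolve(q, v, t_total, kappa, n_steps=2000):
    """Integrate from time 0 to t_total in n_steps equal steps."""
    dt = t_total / n_steps
    for _ in range(n_steps):
        q, v = rk4_step(q, v, dt, kappa)
    return q, v

def there_and_back(q0, v0, t_total, kappa):
    """Run forward, reverse the motion (v -> -v), run forward again."""
    q, v = evolve(q0, v0, t_total, kappa)
    return evolve(q, -v, t_total, kappa)

def energy(q, v):
    """Energy per unit mass for the assumed harmonic force."""
    return 0.5 * v * v + 0.5 * q * q

q_free, v_free = there_and_back(1.0, 0.5, 3.0, kappa=0.0)      # reversible
q_damped, v_damped = there_and_back(1.0, 0.5, 3.0, kappa=0.5)  # not reversible
```

In the frictionless run `(q_free, v_free)` should agree with `(1.0, -0.5)` up to integration error, whereas the damped run keeps losing energy on the way back.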

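The choice $\det\mathbf{R} = -1$ means that the rotation $\mathbf{R}$ contains a space reflection.
Indeed, every $\mathbf{R}$ with $\det\mathbf{R} = -1$ can be written as the product of space reflection
(or parity) $\mathbf{P}$:

$$\mathbf{P} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} ,$$

and a rotation matrix $\bar{\mathbf{R}}$ with $\det\bar{\mathbf{R}} = +1$, $\mathbf{R} = \mathbf{P}\bar{\mathbf{R}}$. $\mathbf{P}$ turns a coordinate system
with right-handed orientation into one with left-handed orientation.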



1.14 Space and Time with Galilei Invariance 

(i) The invariance of mechanical laws under translations (a) is a manifestation of 
the homogeneity of the physical, three-dimensional space; invariance under rota- 
tions (R) is an expression of its isotropy. Here we wish to discuss these relations 
a little further. Imagine that we observe the motion of the sun and its planets from
an inertial frame $K_0$. In that frame we establish the equations of motion and, by
solving them, obtain the orbits as a function of time. Another observer who uses a
frame K that is shifted and rotated compared to $K_0$ will describe the same planetary
system by means of the same equations of motion. The explicit solutions will
look different in his system, though, because he sees the same physics taking place
at a different point in space and with a different spatial orientation. However, the
equations of motion that the system obeys, i.e. the basic differential equations, are
the same in either frame. Of course, the observer in K may also choose his time
zero differently from the one in $K_0$, without changing anything in the physics that
takes place. It is in this sense that space and time are homogeneous and space, in
addition, is isotropic. Finally, it is also admissible to let the two systems K and
$K_0$ move with constant velocity $\mathbf{w}$ relative to each other. The equations of motion
depend only on differences of coordinate vectors $(\mathbf{x}^{(i)} - \mathbf{x}^{(k)})$ and therefore do not
change. In other words, physical motion is always relative motion.

So far we have used the passive interpretation of Galilei transformations: the 
physical system (the sun and its planets) is given and we observe it from different
inertial frames. Of course, one can also choose the active interpretation, that is, 
choose a fixed inertial system and ask the question whether the laws of planetary 
motion are the same, independent of where the motion takes place, of how the 
orbits are oriented in space, and of whether the center of mass is at rest with 
respect to the observer or moves at a constant velocity w. 

[Another way of expressing the passive interpretation is this: an observer lo- 
cated at a point A of the universe will abstract the same fundamental laws from the 
motion of celestial bodies as another observer who is located at a point B of the 
universe. For the active interpretation, on the other hand, one would ask a physicist 
at B to carry out the same experiments as a physicist whose laboratory is based 
at A. If they obtain the same results and reach the same conclusions, under the 
conditions on the relative position (or motion) of their reference frames defined 
above, physics is Galilei invariant.] 

(ii) Suppose we consider two physically connected events (a) and (b), the first
of which takes place at position $\mathbf{x}^{(a)}$ at time $t_a$, while the second takes place at
position $\mathbf{x}^{(b)}$ at time $t_b$. For example, we throw a stone in the gravitational field
of the earth such that at $t_a$ it departs from $\mathbf{x}^{(a)}$ with a certain initial velocity and
arrives at $\mathbf{x}^{(b)}$ at time $t_b$. We parametrize the orbit $\mathbf{x}$ that connects $\mathbf{x}^{(a)}$ and $\mathbf{x}^{(b)}$
and likewise the time variable by

$$\mathbf{x} = \mathbf{x}(\tau) \quad \text{with} \quad \mathbf{x}^{(a)} = \mathbf{x}(\tau_a) , \quad \mathbf{x}^{(b)} = \mathbf{x}(\tau_b) ,$$

$$t = t(\tau) \quad \text{with} \quad t_a = t(\tau_a) , \quad t_b = t(\tau_b) ,$$


where $\tau$ is a scalar parameter (the proper time). The time that a comoving clock
will show has no preferred zero. Furthermore, it can be measured in arbitrary units.
The most general relation between $t$ and $\tau$ is then $t(\tau) = \alpha\tau + \beta$ with $\alpha$ and $\beta$
real constants. Expressed in the form of a differential equation this means that
$\mathrm{d}^2 t/\mathrm{d}\tau^2 = 0$. Similarly, the orbit $\mathbf{x}(\tau)$ obeys the differential equation

$$\frac{\mathrm{d}^2\mathbf{x}}{\mathrm{d}\tau^2} + \mathbf{f}(\mathbf{x})\left(\frac{\mathrm{d}t}{\mathrm{d}\tau}\right)^2 = 0 ,$$

with $\mathrm{d}t/\mathrm{d}\tau = \alpha$ and where $\mathbf{f}$ is minus the force divided by the mass. The comparison
of these differential equations shows the asymmetry between space and
time that we noted earlier. Under Galilei transformations, $t(\tau) = \alpha\tau + \beta$ becomes
$t'(\tau) = \alpha\tau + \beta + s$; that is, time differences such as $(t_a - t_b)$ remain unchanged.
Time $t(\tau)$ runs linearly in $\tau$, independently of the inertial frame one has chosen.
In this sense the time variable of nonrelativistic mechanics has an absolute character.
No such statement applies to the spatial coordinates, as will be clear from
the following reasoning.

We follow the same physical motion as above, from two different inertial
frames K (coordinates $\mathbf{x}, t$) and K' (coordinates $\mathbf{x}', t'$). If they have the same
orientation, they are related by a Galilei transformation $g \in G_{+}^{\uparrow}$, so that

$$t'_a - t'_b = t_a - t_b ,$$

$$\left(\mathbf{x}'^{(a)} - \mathbf{x}'^{(b)}\right)^2 = \left(\mathbf{R}(\mathbf{x}^{(a)} - \mathbf{x}^{(b)}) + \mathbf{w}(t_a - t_b)\right)^2 = \left((\mathbf{x}^{(a)} - \mathbf{x}^{(b)}) + \mathbf{R}^{-1}\mathbf{w}(t_a - t_b)\right)^2 .$$

(The last equation follows because the vectors $\mathbf{z}$ and $\mathbf{R}\mathbf{z}$ have the same length.)
In particular, the transformation law for the velocities is

$$\mathbf{v}' = \mathbf{R}(\mathbf{v} + \mathbf{R}^{-1}\mathbf{w}) \quad \text{and} \quad \mathbf{v}'^2 = (\mathbf{v} + \mathbf{R}^{-1}\mathbf{w})^2 .$$

In observing the same physical process and measuring the distance between points
(a) and (b), observers in K and K' reach different conclusions. Thus, unlike the
time axis, orbital space does not have a universal character.

The reason for the difference in the results obtained in measuring a distance 
is easy to understand: the two systems move relative to one another with constant 
velocity w . From the last equation we see that the velocities at corresponding space 
points differ. In particular, the initial velocities at point (a), i.e. the initial condi- 
tions, are not the same. Therefore, calculating the distance between (a) and (b) 
from the observed velocity and the time difference gives different answers in K 
and in K'. (On the other hand, if we chose the initial velocities in (a) to be the 
same with respect to K and to K', we would indeed find the same distance. However,
these would be two different processes.) The main conclusion is that, while
it is meaningful to talk about the spatial distance of two events taking place at the 
same time, it is not meaningful to compare distances of events taking place at dif- 
ferent times. Such distances depend on the inertial frame one is using. In Sect. 4.7 
we shall establish the geometrical structure of space-time that follows from these 
considerations. 
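
A short numerical illustration of this conclusion may be helpful. The sketch below assumes, purely for the sake of the example, a rotation about the z-axis and arbitrarily chosen values for the boost, the shift, and the two events; the function name `galilei` is ours. It transforms two events into a second frame and compares time differences and spatial distances.

```python
import math

def galilei(x, t, angle, w, a, s):
    """(1.34) with lambda = +1 and R a rotation by `angle` about the z-axis."""
    c, sn = math.cos(angle), math.sin(angle)
    Rx = (c * x[0] - sn * x[1], sn * x[0] + c * x[1], x[2])
    return tuple(Rx[i] + w[i] * t + a[i] for i in range(3)), t + s

def distance(p, q):
    """Euclidean distance between two points of space."""
    return math.sqrt(sum((p[i] - q[i]) ** 2 for i in range(3)))

# Hypothetical frame parameters and events, chosen only for illustration.
frame = dict(angle=0.4, w=(1.0, 0.0, 0.0), a=(2.0, -1.0, 0.5), s=3.0)
xa, ta = (0.0, 0.0, 0.0), 0.0      # event (a)
xb, tb = (3.0, 4.0, 0.0), 2.0      # event (b), two time units later

xa_p, ta_p = galilei(xa, ta, **frame)
xb_p, tb_p = galilei(xb, tb, **frame)
xb_simultaneous, _ = galilei(xb, ta, **frame)   # same place as (b), but at time t_a

time_difference_K, time_difference_Kp = tb - ta, tb_p - ta_p
distance_K, distance_Kp = distance(xa, xb), distance(xa_p, xb_p)
distance_simultaneous_Kp = distance(xa_p, xb_simultaneous)
```

The time difference comes out the same in both frames, and so does the distance between simultaneous events, while `distance_K` and `distance_Kp` differ because the events (a) and (b) occur at different times.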



1.15 Conservative Force Fields 

In our discussion of the n-particle system (Sects. 1.8-1.12) we had assumed the
internal forces to be central forces and hence to be potential forces. Here we wish
to discuss the somewhat more general case of conservative forces.

Conservative forces are defined as follows. Any force field that can be represented
as the (negative) gradient field of a time-independent, potential energy $U(\mathbf{r})$,

$$\mathbf{F} = -\nabla U(\mathbf{r}) ,$$

is called conservative. This definition is equivalent to the statement that the work
done by such forces along a path from $\mathbf{r}_0$ to $\mathbf{r}$ depends only on the starting point
and on the end point but is independent of the shape of that path. More precisely,
a force field is conservative precisely when the path integral

$$\int_{\mathbf{r}_0}^{\mathbf{r}} (\mathbf{F}\cdot\mathrm{d}\mathbf{s}) = -\left(U(\mathbf{r}) - U(\mathbf{r}_0)\right)$$

depends on $\mathbf{r}$ and $\mathbf{r}_0$ only. As already indicated, the integral can then be expressed
as the difference of the potential energies in $\mathbf{r}$ and $\mathbf{r}_0$. In particular, the balance of
the work done or gained along a closed path is zero if the force is conservative,
viz.

$$\oint_{\Gamma} (\mathbf{F}\cdot\mathrm{d}\mathbf{s}) = 0$$

for any closed path $\Gamma$.

What are the conditions for a force field to be conservative, i.e. to be derivable
from a potential? If there is a potential $U$ (which must be at least $C^2$), the equality
of the mixed second derivatives $\partial^2 U/\partial y\,\partial x = \partial^2 U/\partial x\,\partial y$ (cyclic in $x, y, z$)
implies the relations

$$\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} = 0 \quad \text{(plus cyclic permutations)} .$$

Thus the curl of $\mathbf{F}(\mathbf{r})$, i.e.

$$\operatorname{curl}\mathbf{F} := \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z} ,\ \frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x} ,\ \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) ,$$

must vanish. This is a necessary condition, which is sufficient only if the domain
over which the function $U(\mathbf{r})$ is defined and where curl F vanishes is singly connected.
Singly connected means that every closed path that lies entirely in the domain
can be contracted to a point without ever meeting points that do not belong
to the domain. Let $\Gamma$ be a smooth, closed path, let $S$ be the surface enclosed by
it, and let $\hat{\mathbf{n}}$ be the local normal to this surface. Stokes’ theorem of vector analysis
then states that the work done by the force $\mathbf{F}$ along the path $\Gamma$ equals the surface
integral over $S$ of the normal component of its curl:

$$\oint_{\Gamma} (\mathbf{F}\cdot\mathrm{d}\mathbf{s}) = \iint_{S} \mathrm{d}\sigma\,(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{n}} .$$

This formula shows the relationship between the condition curl F = 0 and the 
definition of a conservative force field: the integral on the left-hand side vanishes, 
for all closed paths, only if curl F vanishes everywhere. 

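We consider two examples, for the sake of illustration.

Example (i) A central force has vanishing curl, since

$$\left(\operatorname{curl} f(r)\,\mathbf{r}\right)_x = \frac{\partial}{\partial y}\left(f(r)\,z\right) - \frac{\partial}{\partial z}\left(f(r)\,y\right) = \frac{\mathrm{d}f}{\mathrm{d}r}\left(\frac{\partial r}{\partial y}\,z - \frac{\partial r}{\partial z}\,y\right) = \frac{\mathrm{d}f}{\mathrm{d}r}\,\frac{1}{r}\,(yz - zy) = 0$$

(plus cyclic permutations).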



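Example (ii) The curl of the following force field does not vanish everywhere:

$$F_x = -B\,\frac{y}{\rho^2} , \qquad F_y = +B\,\frac{x}{\rho^2} , \qquad F_z = 0 ,$$

where

$$\rho^2 = x^2 + y^2 \quad \text{and} \quad B = \text{const} .$$

(This is the magnetic field around a straight, conducting wire.) The curl vanishes only
outside the z-axis, i.e. in $\mathbb{R}^3$ from which the z-axis ($x = 0$, $y = 0$) has been cut
out. Indeed, as long as $(x, y) \neq (0, 0)$ we have

$$(\operatorname{curl}\mathbf{F})_x = (\operatorname{curl}\mathbf{F})_y = 0 ,$$

$$(\operatorname{curl}\mathbf{F})_z = B\left(\frac{1}{\rho^2} - \frac{2x^2}{\rho^4} + \frac{1}{\rho^2} - \frac{2y^2}{\rho^4}\right) = 0 .$$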



For $x = y = \rho = 0$, however, the z-component does not vanish. An equivalent
statement is that the closed integral $\oint(\mathbf{F}\cdot\mathrm{d}\mathbf{s})$ vanishes for all paths that do not
enclose the z-axis. For a path that winds around the z-axis once one finds that

$$\oint (\mathbf{F}\cdot\mathrm{d}\mathbf{s}) = 2\pi B .$$

This is shown as follows. Choose a circle of radius $R$ around the origin that lies in
the $(x, y)$-plane. Any other path that winds around the z-axis once can be deformed
continuously to this circle without changing the value of the integral. Choose then
cylindrical coordinates ($x = \rho\cos\phi$, $y = \rho\sin\phi$, $z$). Then $\mathbf{F} = (B/\rho)\,\hat{\mathbf{e}}_\phi$ and
$\mathrm{d}\mathbf{s} = \rho\,\mathrm{d}\phi\,\hat{\mathbf{e}}_\phi$, where $\hat{\mathbf{e}}_\phi = -\hat{\mathbf{e}}_x\sin\phi + \hat{\mathbf{e}}_y\cos\phi$, and $\oint(\mathbf{F}\cdot\mathrm{d}\mathbf{s}) = B\int_0^{2\pi}\mathrm{d}\phi = 2\pi B$.
A path winding around the z-axis $n$ times would give the result $2\pi n B$.
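
These statements are easy to test numerically. The following Python sketch is a rough check under our own assumptions (circular paths in the $(x, y)$-plane, a midpoint rule for the line integral, central differences for the curl, and $B = 1$ by default); the function names are hypothetical.

```python
import math

def force(x, y, B=1.0):
    """In-plane components (F_x, F_y) of the field of Example (ii)."""
    rho2 = x * x + y * y
    return (-B * y / rho2, B * x / rho2)

def circulation(cx, cy, radius, n=2000, B=1.0):
    """Midpoint-rule estimate of the closed line integral of F . ds along a
    circle with centre (cx, cy), traversed once counter-clockwise."""
    dphi = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * dphi
        x = cx + radius * math.cos(phi)
        y = cy + radius * math.sin(phi)
        fx, fy = force(x, y, B)
        # ds = radius * dphi * (-sin(phi), cos(phi))
        total += (-fx * math.sin(phi) + fy * math.cos(phi)) * radius * dphi
    return total

def curl_z(x, y, h=1e-4, B=1.0):
    """Central-difference estimate of dF_y/dx - dF_x/dy away from the z-axis."""
    dFy_dx = (force(x + h, y, B)[1] - force(x - h, y, B)[1]) / (2.0 * h)
    dFx_dy = (force(x, y + h, B)[0] - force(x, y - h, B)[0]) / (2.0 * h)
    return dFy_dx - dFx_dy

around_axis = circulation(0.0, 0.0, 1.5)        # path encloses the z-axis
off_centre = circulation(0.3, -0.2, 2.0)        # still encloses the z-axis
away_from_axis = circulation(3.0, 0.0, 1.0)     # does not enclose the z-axis
```

The first two values should be close to $2\pi B$ independently of the shape of the path, the third close to zero, while `curl_z` is small at any point off the axis.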

Yet, in this example one can define a potential, viz.

$$U(\mathbf{r}) = -B\arctan(y/x) = -B\phi .$$

This function is unique over any partial domain of $\mathbb{R}^3$ that avoids the z-axis. However,
as soon as the domain contains the z-axis this function ceases to be unique, in
spite of the fact that curl F vanishes everywhere outside the z-axis. Clearly, such
a domain is no longer singly connected.



1.16 One-Dimensional Motion of a Point Particle

Let q be the coordinate, p the corresponding momentum, and F(q) the force. We 
then have 

$$\dot{q} = \frac{1}{m}\,p ; \qquad \dot{p} = F(q) . \tag{1.36}$$

This is another way of writing the equation of motion. The first equation repeats the
definition of the momentum; $F(q)$ on the right-hand side of the second equation
is the force field (in one dimension).

The kinetic energy is $T = m\dot{q}^2/2 = p^2/2m$. The function $F(q)$ is assumed
to be continuous. In one dimension there is always a potential energy
$U(q) = -\int_{q_0}^{q} F(q')\,\mathrm{d}q'$ such that $F(q) = -\mathrm{d}U(q)/\mathrm{d}q$. The total energy $E = T + U$
is conserved: $\mathrm{d}E/\mathrm{d}t = \mathrm{d}(T + U)/\mathrm{d}t = 0$, the time derivative being taken along
solutions of the equation of motion. Take as an example the harmonic force
$F(q) = -\kappa q$, with $\kappa$ a real positive constant, i.e. a force that is linear in the coordinate
$q$ and tends to drive the system back to the equilibrium position $q_0 = 0$ (Hooke’s
law),

$$\dot{q} = \frac{1}{m}\,p , \qquad \dot{p} = -\kappa q .$$

Consider a particular solution, for instance the one that starts at $(q = -a, p = 0)$
at time $t = 0$,

$$q(t) = -a\cos\left(\sqrt{\kappa/m}\,t\right) , \qquad p(t) = a\sqrt{m\kappa}\,\sin\left(\sqrt{\kappa/m}\,t\right) .$$

The spatial motion which is actually seen by an observer is the oscillatory function
$q(t) = -a\cos(\sqrt{\kappa/m}\,t)$ in coordinate space. Although this is a simple function of
time, it would need many words to describe the temporal evolution of the particle’s
trajectory to a third party. Such a description could go as follows: “The particle
starts at $q = -a$, where its kinetic energy is zero, its potential energy is maximal
and equal to the total energy $U(q = -a) = E = (1/2)\kappa a^2$. It accelerates, as it
is driven to the origin, from initial momentum zero to $p = a\sqrt{m\kappa}$ at the time it
passes the origin. At this moment its potential energy is zero, its kinetic energy
$T_{\mathrm{kin}} = E$ is maximal. Beyond that point the particle is slowed down until it reaches
its maximal position $q = a$ where its momentum vanishes again. After that time
the momentum changes sign, increases in magnitude until the particle passes the
origin, then decreases until the particle reaches its initial position. From then on
the motion repeats periodically, the period being $T = 2\pi\sqrt{m/\kappa}$.”

The physics of the particle’s motion becomes much simpler to describe if one 
is ready to accept a small step of abstraction: Instead of studying the coordinate 
function q(t) in its one-dimensional manifold R alone, imagine a two-dimensional 
space with abscissa q and ordinate p, 

$$\{\mathbb{R}^2 , \text{ with coordinates } (q, p)\} ,$$

and draw the solutions $(q(t), p(t))$ as curves in that space, parametrized by time $t$.
In the example we have chosen these are periodic motions, hence closed curves. 
Now, the actual physical motion is obvious and simple: In the two-dimensional 
space spanned by q and p the particle moves on a closed curve, reflecting the 
alternating behaviour of position and momentum, of kinetic and potential energies. 

These elementary considerations and the example we have given may be helpful 
in motivating the following definitions. 

We introduce a compact notation for the equations (1.36) by means of the
following definitions. With

$$x = \{x_1 := q ,\ x_2 := p\} ; \qquad \mathcal{F} = \left\{\mathcal{F}_1 := \frac{1}{m}\,p ,\ \mathcal{F}_2 := F(q)\right\} ,$$

(1.36) reads

$$\dot{x} = \mathcal{F}(x, t) . \tag{1.37}$$

The solutions $x_1(t) = \varphi(t)$ and $x_2(t) = m\dot{\varphi}(t)$ of this differential equation are
called phase portraits. The energy function $E(q, p) = E(\varphi(t), \dot{\varphi}(t))$, when taken
along the phase curves, is constant.

The $x$ are points of a phase space $\mathbb{P}$ whose dimension is $\dim\mathbb{P} = 2$. One
should note that the abscissa $q$ and the ordinate $p$ are independent variables that
span the phase space; $p$ is a function of $q$ only along solution curves of (1.36) or
(1.37). The physical motion “flows” across the phase space. To illustrate this new
picture of mechanical processes we consider two more examples.

1.17 Examples of Motion in One Dimension 

1.17.1 The Harmonic Oscillator 

The harmonic oscillator is defined by its force law $F(q) = -m\omega^2 q$. The applied
force is proportional to the elongation and is directed so that it always drives the
particle back to the origin. The potential energy is then

$$U(q) = \tfrac{1}{2}m\omega^2\left(q^2 - q_0^2\right) , \tag{1.38}$$

where $q_0$ can be chosen to be zero, without loss of generality. One has
$\dot{x} = \mathcal{F}(x)$ with $x_1 = q$, $x_2 = p$, and

$$\mathcal{F}_1 = \frac{1}{m}\,p = \frac{1}{m}\,x_2 , \qquad \mathcal{F}_2 = F(q) = -m\omega^2 x_1 ,$$

so that the equations of motion (1.37) read explicitly

$$\dot{x}_1 = \frac{1}{m}\,x_2 , \qquad \dot{x}_2 = -m\omega^2 x_1 .$$

The total energy is conserved and has the form

$$E = \frac{x_2^2}{2m} + \frac{1}{2}m\omega^2 x_1^2 = \text{const} .$$

One can hide the constants $m$ and $\omega$ by redefining the space and time variables
as follows:

$$z_1(\tau) := \omega\sqrt{m}\,x_1(t) , \qquad z_2(\tau) := \frac{1}{\sqrt{m}}\,x_2(t) , \qquad \tau := \omega t .$$

This transformation makes the energy a simple quadratic form,

$$E = \tfrac{1}{2}\left[z_1^2 + z_2^2\right] ,$$

while time is measured in units of the inverse circular frequency $\omega^{-1} = T/2\pi$.
One obtains the system of equations

$$\frac{\mathrm{d}z_1(\tau)}{\mathrm{d}\tau} = z_2(\tau) , \qquad \frac{\mathrm{d}z_2(\tau)}{\mathrm{d}\tau} = -z_1(\tau) . \tag{1.39}$$

It is not difficult to guess their solution for the initial conditions $z_1(\tau = 0) = z_1^0$,
$z_2(\tau = 0) = z_2^0$. It is

$$z_1(\tau) = \sqrt{(z_1^0)^2 + (z_2^0)^2}\,\cos(\tau - \varphi) ,$$

$$z_2(\tau) = -\sqrt{(z_1^0)^2 + (z_2^0)^2}\,\sin(\tau - \varphi) , \quad \text{where}$$

$$\sin\varphi = z_2^0 \Big/ \sqrt{(z_1^0)^2 + (z_2^0)^2} , \qquad \cos\varphi = z_1^0 \Big/ \sqrt{(z_1^0)^2 + (z_2^0)^2} .$$

The motion corresponding to a fixed value of the energy E becomes particularly 
clear if followed in phase space $(z_1, z_2)$. The solution curves in phase space are
called phase portraits. In our example they are circles of radius $\sqrt{2E}$, on which the
system moves clockwise. The example is completely symmetric in coordinate and
momentum variables. Figure 1.8 shows in its upper part the potential as a function
of $z_1$ as well as two typical values of the energy. In the lower part it shows the
phase portraits corresponding to these energies.

Note what we have gained in describing the motion in phase space rather than 
in coordinate space only. True, the coordinate space of the harmonic oscillator is 
directly “visible”. However, if we try to describe the temporal evolution of a spe- 
cific solution q(t) in any detail (i.e. the swinging back and forth, with alternating 
accelerations and decelerations, etc.), we will need many words for a process that 
is basically so simple. Adopting the phase-space description of the oscillator, on 
the other hand, means a first step of abstraction because one interprets the momentum
as a new, independent variable, a quantity that is measurable but not directly
“visible”. The details of the motion become more transparent and are very simple
to describe: the oscillation is now a closed curve (lower part of Fig. 1.8) from
which one directly reads off the time variation of the position and momentum and 
therefore also that of the potential and kinetic energy. 

The transformation to the new variables $z_1 = \omega\sqrt{m}\,q$ and $z_2 = p/\sqrt{m}$ shows
that in the present example the phase portraits are topologically equivalent to circles
along which the oscillator moves with constant angular velocity $\omega$.
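
As a quick plausibility check, the closed-form solution of (1.39) can be evaluated numerically. The sketch below is an illustration only: the function names and the sample values of $m$, $\omega$, $q$ and $p$ are our assumptions. It maps $(q, p)$ to the reduced variables, samples the solution at a sequence of times $\tau$, and records the points of the phase portrait.

```python
import math

def to_reduced(q, p, m, omega):
    """z1 = omega * sqrt(m) * q,  z2 = p / sqrt(m)."""
    return omega * math.sqrt(m) * q, p / math.sqrt(m)

def solution(z1_0, z2_0, tau):
    """Closed-form solution of (1.39) for initial values (z1_0, z2_0) at tau = 0."""
    amplitude = math.hypot(z1_0, z2_0)
    phi = math.atan2(z2_0, z1_0)
    return amplitude * math.cos(tau - phi), -amplitude * math.sin(tau - phi)

def energy(z1, z2):
    """E = (z1^2 + z2^2) / 2 in the reduced variables."""
    return 0.5 * (z1 * z1 + z2 * z2)

# Hypothetical sample values: the particle starts at rest at q = -0.5.
m, omega = 2.0, 3.0
q0, p0 = -0.5, 0.0
z0 = to_reduced(q0, p0, m, omega)
orbit = [solution(z0[0], z0[1], 0.1 * k) for k in range(63)]
radius = math.sqrt(2.0 * energy(*z0))
```

Every point of `orbit` should lie at distance `radius` from the origin, and successive points should follow one another in the clockwise sense.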

1.17.2 The Planar Mathematical Pendulum 

Strictly speaking, the planar pendulum is already a constrained system: a mass 
point moves on a circle of constant radius, as sketched in Fig. 1.9. However, it is 
so simple that we may treat it like a free one-dimensional system and do not need 
the full formalism of constrained motion yet. We denote by $\varphi(t)$ the angle that
measures the deviation of the pendulum from the vertical and by $s(t) = l\varphi(t)$ the
length of the corresponding arc on the circle. We then have

$$T = \tfrac{1}{2}m\dot{s}^2 = \tfrac{1}{2}ml^2\dot{\varphi}^2 ,$$

$$U = \int_0^{s} mg\sin\varphi'\,\mathrm{d}s' = mgl\int_0^{\varphi}\sin\varphi'\,\mathrm{d}\varphi' , \quad \text{or}$$

$$U = -mgl\,[\cos\varphi - 1] .$$



We introduce the constants

$$\varepsilon := \frac{E}{mgl} = \frac{1}{2\omega^2}\,\dot{\varphi}^2 + 1 - \cos\varphi \quad \text{with} \quad \omega^2 := \frac{g}{l} .$$

As in Sect. 1.17.1 we set $z_1 = \varphi$, $\tau = \omega t$, and $z_2 = \dot{\varphi}/\omega$. Then $\varepsilon = z_2^2/2 + 1 - \cos z_1$,
while the equation of motion $ml\ddot{\varphi} = -mg\sin\varphi$ reads, in the new variables,

$$\frac{\mathrm{d}z_1}{\mathrm{d}\tau} = z_2(\tau) , \qquad \frac{\mathrm{d}z_2}{\mathrm{d}\tau} = -\sin z_1(\tau) . \tag{1.40}$$



In the limit of small deviations from the vertical one has $\sin z_1 = z_1 + O(z_1^3)$
and (1.40) reduces to the system (1.39) of the oscillator. In Fig. 1.10 we sketch
the potential $U(z_1)$ and some phase portraits. For values of $\varepsilon$ below 2 the picture
is qualitatively similar to that of the oscillator (see Fig. 1.8). The smaller $\varepsilon$, the
closer this similarity. For $\varepsilon > 2$ the pendulum always swings in one direction,
either clockwise or anticlockwise. The boundary $\varepsilon = 2$ between these qualitatively
different domains is a singular value and corresponds to the motion where the
pendulum reaches the uppermost position but cannot swing beyond it. In Sect. 1.23
we shall show that the pendulum reaches the upper extremum, which is also an
unstable equilibrium position, only after infinite time. This singular orbit is called
the separatrix; it separates the domain of oscillatory solutions from that of rotating
solutions.

Fig. 1.10. The potential energy $U(q) = 1 - \cos q$ of the plane pendulum, as well as a few phase
portraits $p = \sqrt{2(\varepsilon - 1 + \cos q)}$, as a function of $q$ and for several values of the reduced energy
$\varepsilon = E/mgl$. Note that in the text $q \equiv z_1$, $p \equiv z_2$. The values of $\varepsilon$ can be read off the ordinate

Note that in Fig. 1.10 only the interval $q \in [-\pi, +\pi]$ is physically relevant.
Beyond these points the picture repeats itself such that one should cut the figure
at the points marked B and glue the obtained strip on a cylinder.
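
The two qualitatively different regimes can be reproduced by integrating (1.40) numerically. The following Python sketch uses a standard fourth-order Runge-Kutta step; the step size, the integration time and the two initial conditions are our own arbitrary choices, and the helper names are hypothetical. One orbit is started with $\varepsilon < 2$ and one with $\varepsilon > 2$, and each is classified by whether $z_2$ ever changes sign.

```python
import math

def field(z1, z2):
    """Right-hand side of (1.40)."""
    return z2, -math.sin(z1)

def reduced_energy(z1, z2):
    """epsilon = z2^2 / 2 + 1 - cos(z1)."""
    return 0.5 * z2 * z2 + 1.0 - math.cos(z1)

def rk4_step(z1, z2, h):
    k1 = field(z1, z2)
    k2 = field(z1 + 0.5 * h * k1[0], z2 + 0.5 * h * k1[1])
    k3 = field(z1 + 0.5 * h * k2[0], z2 + 0.5 * h * k2[1])
    k4 = field(z1 + h * k3[0], z2 + h * k3[1])
    return (z1 + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            z2 + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def trajectory(z1, z2, tau_max=20.0, h=0.002):
    """List of phase-space points from tau = 0 to tau_max."""
    points = [(z1, z2)]
    for _ in range(int(round(tau_max / h))):
        z1, z2 = rk4_step(z1, z2, h)
        points.append((z1, z2))
    return points

def kind(points):
    """'rotation' if z2 never changes sign along the orbit, else 'oscillation'."""
    signs = {z2 > 0.0 for _, z2 in points}
    return "rotation" if len(signs) == 1 else "oscillation"

below = trajectory(0.0, 1.9)   # epsilon = 1.805 < 2
above = trajectory(0.0, 2.1)   # epsilon = 2.205 > 2
```

For the first orbit $|z_1|$ stays below $\pi$, whereas on the second $z_1$ keeps growing; in both cases `reduced_energy` should remain constant up to integration error.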



1.18 Phase Space for the n-Particle System (in $\mathbb{R}^3$)



In Sect. 1.16 we developed the representation of one-dimensional mechanical sys- 
tems in phase space. It is not difficult to generalize this to higher-dimensional 
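systems such as the n-particle system over $\mathbb{R}^3$. For this purpose we set

$$x_1 := x^{(1)} ,\ x_2 := y^{(1)} ,\ x_3 := z^{(1)} ,\ x_4 := x^{(2)} ,\ \ldots ,\ x_{3n} := z^{(n)} ,$$

$$x_{3n+1} := p_x^{(1)} ,\ x_{3n+2} := p_y^{(1)} ,\ \ldots ,\ x_{6n} := p_z^{(n)} .$$

This allows us to write the equations of motion in the same compact form (1.37)
provided one defines

$$\mathcal{F}_1 := \frac{1}{m_1}\,p_x^{(1)} ,\ \mathcal{F}_2 := \frac{1}{m_1}\,p_y^{(1)} ,\ \ldots ,\ \mathcal{F}_{3n} := \frac{1}{m_n}\,p_z^{(n)} ,$$

$$\mathcal{F}_{3n+1} := F_x^{(1)} ,\ \mathcal{F}_{3n+2} := F_y^{(1)} ,\ \ldots ,\ \mathcal{F}_{6n} := F_z^{(n)} .$$

The original equations

$$\dot{\mathbf{p}}^{(i)} = \mathbf{F}^{(i)}\left(\mathbf{r}^{(1)}, \ldots, \mathbf{r}^{(n)}, \dot{\mathbf{r}}^{(1)}, \ldots, \dot{\mathbf{r}}^{(n)}, t\right) , \qquad \dot{\mathbf{r}}^{(i)} = \frac{1}{m_i}\,\mathbf{p}^{(i)}$$

then read

$$\dot{x} = \mathcal{F}(x, t) . \tag{1.41}$$

The variable $x = (x_1, x_2, \ldots, x_{6n})$ summarizes the $3n$ coordinates and $3n$ momenta

$$\mathbf{r}^{(i)} = \left(x^{(i)}, y^{(i)}, z^{(i)}\right) , \quad \mathbf{p}^{(i)} = \left(p_x^{(i)}, p_y^{(i)}, p_z^{(i)}\right) , \quad i = 1, \ldots, n .$$

The n-particle system has $3n$ coordinates or degrees of freedom, $f = 3n$. (The
number of degrees of freedom, i.e. the number of independent coordinate variables,
will always be denoted by $f$.)

$x$ is a point in phase space whose dimension is $\dim\mathbb{P} = 2f$ ($= 6n$, here). This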
compact notation is more than a formal trick: one can prove a number of important 
properties for first-order differential equations such as (1.41) that do not depend 
on the dimension of the system, i.e. the number of components it has. 
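
The bookkeeping behind (1.41) can be spelled out in a few lines of code. The sketch below is one possible arrangement, not a prescribed one: the helper names `pack`, `unpack` and `make_field`, as well as the two-particle spring example with its masses and spring constant, are assumptions made for illustration. It stores the $3n$ coordinates followed by the $3n$ momenta in one flat list and builds the corresponding vector field from a user-supplied force law.

```python
def pack(positions, momenta):
    """x = (x1, y1, z1, ..., zn, px1, py1, pz1, ..., pzn) as a flat list."""
    return [c for r in positions for c in r] + [c for p in momenta for c in p]

def unpack(x):
    """Inverse of pack: recover the lists of position and momentum triples."""
    n = len(x) // 6
    positions = [tuple(x[3 * i:3 * i + 3]) for i in range(n)]
    momenta = [tuple(x[3 * n + 3 * i:3 * n + 3 * i + 3]) for i in range(n)]
    return positions, momenta

def make_field(masses, forces):
    """Build F(x, t): first 3n components p/m, last 3n components the forces."""
    def field(x, t):
        positions, momenta = unpack(x)
        velocities = [tuple(c / m for c in p) for p, m in zip(momenta, masses)]
        f = forces(positions, velocities, t)
        return [c for v in velocities for c in v] + [c for fi in f for c in fi]
    return field

def spring_forces(positions, velocities, t, k=4.0):
    """Assumed example: two particles coupled by a spring of zero rest length."""
    r1, r2 = positions
    f1 = tuple(-k * (a - b) for a, b in zip(r1, r2))
    return [f1, tuple(-c for c in f1)]

masses = [1.0, 2.0]
x = pack([(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
         [(1.0, 0.0, 0.0), (0.0, -2.0, 4.0)])
xdot = make_field(masses, spring_forces)(x, 0.0)
```

For $n = 2$ the state `x` has $6n = 12$ components; the first six entries of `xdot` are the velocities, the last six the forces, which here add up to zero as required for internal forces.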



1.19 Existence and Uniqueness of the Solutions of $\dot{x} = \mathcal{F}(x, t)$



A striking feature of the phase portraits in Figs. 1.8 and 1.10 is that no two phase 
curves ever intersect. (Point B of Fig. 1.10 seems an exception: the separatrix arriv- 
ing from above, the one departing towards the bottom, and the unstable equilibrium 
meet at B. In reality they do not intersect because B is reached at different times 
in the three cases; see below.) This makes sense on physical grounds: if two phase 
portraits did intersect, on arriving at the point of intersection the system would 
have the choice between two possible ways of continuing its evolution. The de- 
scription by means of (1.41) would be incomplete. As phase portraits do not in fact 
intersect, a single point $y \in \mathbb{P}$ together with (1.41) fixes the whole portrait. This
point y, which defines the positions and momenta (or velocities), can be under- 
stood as the initial condition that is assumed at a given time t = s. This condition 
defines how the system will continue to evolve locally. 

The theory of ordinary differential equations gives precise information about 
the existence and uniqueness of solutions for (1.41), provided the function $\mathcal{F}(x, t)$
fulfills certain conditions. This information is of immediate relevance for physi- 
cal orbits that are described by Newton’s equations. We quote the following basic 
theorem but refer to the literature for its proof (see e.g. Arnol’d 1992). 



Let $\mathcal{F}(x, t)$ with $x \in \mathbb{P}$ and $t \in \mathbb{R}$ be continuous and, with respect to $x$,
continuously differentiable. Then, for any $z \in \mathbb{P}$ and any $s \in \mathbb{R}$ there is a
neighborhood $U$ of $z$ and an interval $I$ around $s$ such that for all $y \in U$
there is precisely one curve $x(t, s, y)$ with $t$ in $I$ that fulfills the following
conditions:

(i) $\frac{\partial}{\partial t}\,x(t, s, y) = \mathcal{F}[x(t, s, y), t]$,

(ii) $x(t = s, s, y) = y$, (1.42)

(iii) $x(t, s, y)$ has continuous derivatives in $t$, $s$, and $y$.

$y$ is the initial point in phase space from which the system starts at time
$t = s$. The solution $x(t, s, y)$ is called the integral curve of the vector field $\mathcal{F}(x, t)$.

For later purposes (see Chap. 5) we note that $\mathcal{F}(x, t) =: \mathcal{F}_t(x)$ can be understood
to be a vector field that associates to any $x$ the velocity vector $\dot{x} = \mathcal{F}_t(x)$.
This picture is a useful tool for approximate constructions of solution curves in
phase space in those cases where one does not have closed expressions for the
solutions.
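
The simplest such approximate construction follows the arrows of the vector field in short straight steps (Euler's polygon method). The sketch below is meant as an illustration of the idea only; the choice of this particular scheme, the function names, and the test field (the oscillator of (1.39), whose exact solution is known) are our assumptions.

```python
import math

def euler_curve(field, y, s, t, n_steps):
    """Polygonal approximation to x(t, s, y): starting from y at time s,
    take n_steps straight steps along the local velocity vector F(x, time)."""
    h = (t - s) / n_steps
    x, time = list(y), s
    for _ in range(n_steps):
        v = field(x, time)
        x = [xi + h * vi for xi, vi in zip(x, v)]
        time += h
    return x

def oscillator(z, tau):
    """Vector field of (1.39): dz1/dtau = z2, dz2/dtau = -z1."""
    return [z[1], -z[0]]

def error(approx, exact):
    return max(abs(a - b) for a, b in zip(approx, exact))

# Exact solution for the initial point (1, 0): z1 = cos(tau), z2 = -sin(tau).
exact = [math.cos(1.0), -math.sin(1.0)]
coarse = euler_curve(oscillator, [1.0, 0.0], 0.0, 1.0, 100)
fine = euler_curve(oscillator, [1.0, 0.0], 0.0, 1.0, 10000)
```

Refining the steps brings the polygon closer to the integral curve: `error(fine, exact)` should be markedly smaller than `error(coarse, exact)`.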



1.20 Physical Consequences of the Existence and Uniqueness Theorem



Systems described by the equations of motion (1.41) have the following important 
properties: 

(i) They are finite dimensional, i.e. every state of the system is completely determined
by a point $z$ in $\mathbb{P}$. The phase space has dimension $2f$, where $f$ is
the number of degrees of freedom.

(ii) They are differential systems, i.e. the equations of motion are differential 
equations of finite order. 

(iii) They are deterministic, i.e. the initial positions and momenta determine the 
solution locally (depending on the maximal neighborhood U and maximal 
interval I ) in a unique way. In particular, this means that two phase curves 
do not intersect (in U and I). 

Suppose we know all solutions corresponding to all possible initial conditions,

$$x(t, s, y) \equiv \Phi_{t,s}(y) . \tag{1.43}$$

This two-parameter set of solutions defines a mapping of $\mathbb{P}$ onto $\mathbb{P}$, $y \mapsto x = \Phi_{t,s}(y)$.
This mapping is unique, and both it and its inverse are differentiable.
The set $\Phi_{t,s}(y)$ is called the flow in phase space $\mathbb{P}$.

Consider a system whose initial configuration at time $s$ is $y \in \mathbb{P}$. The flow
describes how the system will evolve from there under the action of its dynamics. 
At time t it takes on the configuration x, where t may be later or earlier than 
s. In the first case we find the future evolution of the system, in the second we 
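reconstruct its past. As is customary in mathematics, let the symbol $\circ$ denote the
composition of two maps. For example,

$$x \underset{f}{\longmapsto} y = f(x) \underset{g}{\longmapsto} z = g(y) \quad \text{or} \quad x \underset{g \circ f}{\longmapsto} z = g(f(x)) .$$

With the times $r, s, t$ in the interval $I$ we then have

$$\Phi_{t,s} \circ \Phi_{s,r} = \Phi_{t,r} , \qquad \Phi_{s,s} = \mathbf{1} ,$$

$$\frac{\partial}{\partial t}\,\Phi_{t,s} = \mathcal{F}_t \circ \Phi_{t,s} \quad \text{with} \quad \mathcal{F}_t := \frac{\partial}{\partial t}\,\Phi_{t,s}\Big|_{s=t} .$$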



For autonomous systems, i.e. for systems where $\mathcal{F}$ does not depend explicitly on
time, we have

$$x(t + r, s + r, y) = x(t, s, y) , \quad \text{or} \quad \Phi_{t+r,s+r} = \Phi_{t,s} \equiv \Phi_{t-s} . \tag{1.44}$$

In other words, such systems are invariant under time translations.

Proof. Let $t' = t + r$, $s' = s + r$. As $\partial/\partial t = \partial/\partial t'$, we have

$$\frac{\partial}{\partial t}\,x(t + r \equiv t' ; s + r \equiv s', y) = \mathcal{F}(x(t', s', y))$$

with the initial condition

$$x(s', s', y) = x(s + r, s + r, y) = y .$$

Compare this with the solution of

$$\frac{\partial}{\partial t}\,x(t, s, y) = \mathcal{F}(x(t, s, y)) \quad \text{with} \quad x(s, s, y) = y .$$

From the existence and uniqueness theorem follows

$$x(t + r, s + r, y) = x(t, s, y) . \qquad \square$$
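
The properties of the flow stated above lend themselves to a numerical test. In the sketch below the flow $\Phi_{t,s}$ is approximated by Runge-Kutta integration from time $s$ to time $t$. The example field (an oscillator with an optional periodic driving term $d\cos 2t$), the parameter values, and all names are assumptions chosen for illustration. With $d = 0$ the system is autonomous; with $d \neq 0$ it is not.

```python
import math

def make_field(drive):
    """F(x, t) = (x2, -x1 + drive * cos(2 t)); autonomous if drive == 0."""
    def field(x, t):
        return (x[1], -x[0] + drive * math.cos(2.0 * t))
    return field

def flow(field, t, s, y, n_steps=2000):
    """Approximate Phi_{t,s}(y) by RK4 integration from time s to time t.
    t may be smaller than s, in which case the past is reconstructed."""
    h = (t - s) / n_steps
    x, time = tuple(y), s
    for _ in range(n_steps):
        k1 = field(x, time)
        k2 = field((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]), time + 0.5 * h)
        k3 = field((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]), time + 0.5 * h)
        k4 = field((x[0] + h * k3[0], x[1] + h * k3[1]), time + h)
        x = (x[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             x[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
        time += h
    return x

def gap(a, b):
    """Largest componentwise difference between two phase-space points."""
    return max(abs(u - v) for u, v in zip(a, b))

driven = make_field(0.8)   # explicitly time dependent
free = make_field(0.0)     # autonomous
```

One expects `flow(driven, t, s, flow(driven, s, r, y))` to agree with `flow(driven, t, r, y)` for any such field, whereas the shift property (1.44), i.e. `flow(field, t + r, s + r, y)` equal to `flow(field, t, s, y)`, should hold for `free` but fail for `driven`.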

In principle, for a complete description of the solutions of (1.41) we should 
add the time variable as an additional, orthogonal coordinate to the phase space 
$\mathbb{P}$. If we do this we obtain what is called the extended phase space $\mathbb{P} \times \mathbb{R}_t$, whose
dimension is $(2f + 1)$ and thus is an odd integer. As time flows monotonically and
is not influenced by the dynamics, the special solution $(x(t), t)$ in extended phase
space $\mathbb{P} \times \mathbb{R}_t$ contains no new information compared to its projection $x(t)$ onto
phase space $\mathbb{P}$ alone. Similarly, the projection of the original flow $(\Phi_{t,s}(y), t)$ in
extended phase space $\mathbb{P} \times \mathbb{R}_t$ onto $\mathbb{P}$ is sufficient to give an almost complete image
of the mechanical system one is considering. 

Figure 1.10, which shows typical phase portraits for the planar pendulum, yields
a particularly instructive illustration of the existence and uniqueness theorem.
Given an arbitrary point $y = (q, p)$, at arbitrary time $s$, the entire portrait passing
through this point is fixed completely. Clearly, one should think of this figure as
a three-dimensional one, by supplementing it with the time axis. For example, a
phase curve whose portrait (i.e. its projection onto the $(q, p)$-plane) is approximately
a circle in this three-dimensional space will wind around the time axis like
a spiral (make your own drawing!). The point B, at first, seems an exception: the
separatrix (A) corresponding to the pendulum being tossed from its stable equilibrium
position so as to reach the highest position without “swinging through”,
the separatrix (B), which starts from the highest point essentially without initial
velocity, and the unstable equilibrium (C) seem to coincide. This is no contradiction
to Theorem 1.19, though, because (A) reaches the point B only at $t = +\infty$,
(B) leaves at $t = -\infty$, while (C) is there at any finite $t$.

We summarize once more the most important consequences of Theorem 1.19.
At any point in time the state of the mechanical system is determined completely
by the $2f$ real numbers $(q_1, \ldots, q_f, p_1, \ldots, p_f)$. We say that it is finite
dimensional. The differential equation (1.41) contains the whole dynamics of the
system. The flow, i.e. the set of all solutions of (1.41), transports the system from
all possible initial conditions to various new positions in phase space. This transport,
when read as a map from $\mathbb{P}$ onto $\mathbb{P}$, is bijective (i.e. it is one to one) and
is differentiable in either direction. The flow conserves the differential structure
of the dynamics. Finally, systems described by (1.41) are deterministic: the complete
knowledge of the momentary configuration (positions and momenta) fixes
uniquely all future and past configurations, as long as the vector field is regular,
as assumed for the theorem.⁵



1.21 Linear Systems 

Linear systems are defined by $\mathcal{F} = A x + b$. They form a particularly simple class
of mechanical systems obeying (1.41). We distinguish them as follows.



1.21.1 Linear, Homogeneous Systems 



Here the inhomogeneity $b$ is absent, so that

$$\dot{x} = A x , \quad\text{where}\quad A = \{a_{ik}\} ,$$

or, written in components,

$$\dot{x}_i = \sum_k a_{ik}\, x_k . \tag{1.45}$$

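Example. The harmonic oscillator is described by a linear, homogeneous equation
of the type (1.41), viz.

$$\begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \end{pmatrix} = \begin{pmatrix} 0 & 1/m \\ -m\omega^2 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} . \tag{1.46}$$

The explicit solutions of Sect. 1.17.1 can also be written as follows:

$$x_1(t) = x_1^0 \cos\tau + \frac{x_2^0}{m\omega} \sin\tau , \quad x_2(t) = -x_1^0\, m\omega \sin\tau + x_2^0 \cos\tau .$$

Set $\tau = \omega(t - s)$ and

$$y = \begin{pmatrix} x_1^0 \\ x_2^0 \end{pmatrix} .$$

Then

$$x(t) \equiv x(t, s, y) = \Phi_{t,s}(y) = M(t, s) \cdot y , \quad\text{with}\quad M(t, s) = \begin{pmatrix} \cos\omega(t-s) & \dfrac{1}{m\omega}\sin\omega(t-s) \\ -m\omega\sin\omega(t-s) & \cos\omega(t-s) \end{pmatrix} . \tag{1.47}$$

One confirms that $\Phi_{t,s}$ and $M(t, s)$ depend only on the difference $(t - s)$. This
must be so because we are dealing with an autonomous system. It is interesting
to note that the matrix $M$ has determinant 1. We shall return to this observation
later.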
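The properties of $M(t, s)$ just stated can be checked numerically. The following is a minimal sketch, assuming arbitrary illustrative values $m = 1$, $\omega = 2$; the helper names (`flow_matrix`, `matmul`, `det`, `max_diff`) are hypothetical and not taken from the text. It tests the unit determinant, the invariance under a common shift of $t$ and $s$ as in (1.44), and the composition rule $M(t, s) = M(t, r)\,M(r, s)$ that a flow must obey.

```python
import math

def flow_matrix(t, s, m=1.0, omega=2.0):
    """Matrix M(t, s) of (1.47) for the harmonic oscillator, acting on (q, p)."""
    tau = omega * (t - s)
    c, sn = math.cos(tau), math.sin(tau)
    return [[c, sn / (m * omega)],
            [-m * omega * sn, c]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def max_diff(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

M = flow_matrix(1.7, 0.4)
# Autonomy: shifting t and s by the same amount leaves M unchanged, cf. (1.44).
shift_error = max_diff(M, flow_matrix(1.7 + 5.0, 0.4 + 5.0))
# Composition of flows: M(t, s) = M(t, r) M(r, s) for an intermediate time r.
comp_error = max_diff(M, matmul(flow_matrix(1.7, 1.0), flow_matrix(1.0, 0.4)))
print(det(M), shift_error, comp_error)
```

The determinant equals $\cos^2\tau + \sin^2\tau = 1$ for any choice of $m$ and $\omega$, so the check does not depend on the illustrative parameter values.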

⁵ Note that existence and uniqueness are guaranteed only locally (in space and time). Only in
exceptional cases does the theorem allow one to predict the long-term behavior of the system.
Global behavior of dynamical systems is discussed in Sect. 6.3. Some results can also be obtained
from energy estimates in connection with the virial, cf. Sect. 1.31 below.
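1.21.2 Linear, Inhomogeneous Systems

These have the general form

$$\dot{x} = A x + b . \tag{1.48}$$

Example. Lorentz force with homogeneous fields. A particle of charge $e$ in external
electric and magnetic fields is subject to the force

$$\mathbf{K} = \frac{e}{c}\, \dot{\mathbf{r}} \times \mathbf{B} + e \mathbf{E} . \tag{1.49}$$

In the compact notation we have

$$x_1 = x , \quad x_2 = y , \quad x_3 = z , \quad x_4 = p_x , \quad x_5 = p_y , \quad x_6 = p_z .$$

Let the magnetic field point in the $z$-direction, $\mathbf{B} = B\,\hat{\mathbf{e}}_z$, i.e. $\dot{\mathbf{r}} \times \mathbf{B} = (\dot{y} B, -\dot{x} B, 0)$.
Setting $K = eB/mc$ we then have $\dot{x} = A x + b$, with

$$A = \begin{pmatrix} 0 & 0 & 0 & 1/m & 0 & 0 \\ 0 & 0 & 0 & 0 & 1/m & 0 \\ 0 & 0 & 0 & 0 & 0 & 1/m \\ 0 & 0 & 0 & 0 & K & 0 \\ 0 & 0 & 0 & -K & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} , \quad b = e \begin{pmatrix} 0 \\ 0 \\ 0 \\ E_x \\ E_y \\ E_z \end{pmatrix} . \tag{1.50}$$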
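As an illustration of (1.50), the sketch below builds $A$ and $b$ and integrates $\dot{x} = Ax + b$ with the classical fourth-order Runge-Kutta scheme. The parameter values ($m = 1$, $K = 1$, $e\mathbf{E} = (0, 0, 0.3)$, the initial state) and the function names are assumptions chosen for illustration only. With the field $\mathbf{E}$ along $z$, the transverse momenta should rotate with circular frequency $K$, while $p_z$ grows linearly in time.

```python
import math

def lorentz_system(m, kappa, eE):
    """Return (A, b) of x' = A x + b for x = (x, y, z, p_x, p_y, p_z), B along z."""
    A = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        A[i][i + 3] = 1.0 / m          # d(position)/dt = momentum / m
    A[3][4] = kappa                    # dp_x/dt = +kappa * p_y + e E_x
    A[4][3] = -kappa                   # dp_y/dt = -kappa * p_x + e E_y
    b = [0.0, 0.0, 0.0, eE[0], eE[1], eE[2]]
    return A, b

def rhs(A, b, x):
    """Right-hand side A x + b."""
    return [sum(A[i][j] * x[j] for j in range(6)) + b[i] for i in range(6)]

def rk4(A, b, x, t_end, n):
    """Integrate x' = A x + b from t = 0 to t_end in n Runge-Kutta steps."""
    h = t_end / n
    for _ in range(n):
        k1 = rhs(A, b, x)
        k2 = rhs(A, b, [xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
        k3 = rhs(A, b, [xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
        k4 = rhs(A, b, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
             for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]
    return x

kappa, t_end = 1.0, 1.0                               # illustrative values
A, b = lorentz_system(m=1.0, kappa=kappa, eE=(0.0, 0.0, 0.3))
x_end = rk4(A, b, [0.0, 0.0, 0.0, 1.0, 0.5, 0.2], t_end, 1000)

# Closed-form momenta for comparison: rotation in the (p_x, p_y)-plane.
px_exact = 1.0 * math.cos(kappa * t_end) + 0.5 * math.sin(kappa * t_end)
py_exact = 0.5 * math.cos(kappa * t_end) - 1.0 * math.sin(kappa * t_end)
pz_exact = 0.2 + 0.3 * t_end
print(x_end[3] - px_exact, x_end[4] - py_exact, x_end[5] - pz_exact)
```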



For a complete treatment of linear systems we refer to the mathematical literature
(see e.g. Arnol'd 1992). Some aspects will be dealt with in Sects. 6.2.2 and 6.2.3
in the framework of linearization of vector fields. A further, important example is
contained in Practical Example 2.1 (small oscillations).



1.22 Integrating One-Dimensional Equations of Motion 

The equation of motion for a one-dimensional, autonomous system reads $m\ddot{q} = K(q)$.
If $K(q)$ is a continuous function it possesses a potential energy

$$U(q) = -\int_{q_0}^{q} K(q')\, dq' ,$$

so that the law of energy conservation takes the form

$$\frac{1}{2} m \dot{q}^2 + U(q) = E = \text{const} .$$

From this follows a first-order differential equation for $q(t)$:

$$\frac{dq}{dt} = \sqrt{\frac{2}{m}\big(E - U(q)\big)} . \tag{1.51}$$
This is a particularly simple example of a differential equation with separable
variables whose general form is

$$\frac{dy}{dx} = \frac{g(y)}{f(x)} \tag{1.52}$$

and for which the following proposition holds (see e.g. Arnol'd 1992).

Theorem. Assume the functions $f(x)$ and $g(y)$ to be continuously differentiable
in a neighborhood of the points $x_0$ and $y_0$, respectively, where they do not vanish,
$f(x_0) \neq 0$, $g(y_0) \neq 0$. The differential equation (1.52) then has a unique solution
$y = F(x)$ in the neighborhood of $x_0$ that fulfills the initial condition $y_0 = F(x_0)$
as well as the relation

$$\int_{x_0}^{x} \frac{dx'}{f(x')} = \int_{y_0}^{F(x)} \frac{dy'}{g(y')} . \tag{1.53}$$

When applied to (1.51) this means that

$$t - t_0 = \sqrt{\frac{m}{2}} \int_{q_0}^{q(t)} \frac{dq'}{\sqrt{E - U(q')}} , \tag{1.54}$$

that is, we obtain an equation which yields the solution if the quadrature on the
right-hand side can be carried out. The fact that there was an integral of the motion
(here the law of energy conservation) allowed us to reduce the second-order
equation of motion to a first-order differential equation that is solved by simple
quadrature.

Equations (1.54) and (1.51) can also be used for a qualitative discussion of
the motion: since $T + U = E$ and since $T$ must be $T \ge 0$, we must always
have $E \ge U(q)$. Consider, for instance, a potential that has a local minimum
at $q = q_0$, as sketched in Fig. 1.11. At the points A, B, and C, $E = U(q)$.
Therefore, solutions with that energy $E$ must lie either between A and B, or beyond
C: $q_A \le q(t) \le q_B$, or $q(t) \ge q_C$.




Fig. 1.11. Example of potential energy in one dimension. From energy conservation
the kinetic energy must vanish in A, B, and C, for a given total energy E.
The hatched areas are excluded for the position variable q
As an example, we consider the first case. Here we obtain finite orbits; the
points A and B are turning points where the velocity $\dot{q}$ passes through zero,
according to (1.51). The motion is periodic, its period of oscillation being given by
$T(E) = 2 \times$ (running time from A to B). Thus

$$T(E) = \sqrt{2m} \int_{q_A(E)}^{q_B(E)} \frac{dq'}{\sqrt{E - U(q')}} . \tag{1.55}$$
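The period integral (1.55) has integrable singularities at both turning points, which a naive quadrature handles poorly. The following sketch shows one possible way around this, not the only one: the substitution $q = \tfrac{1}{2}(q_A + q_B) + \tfrac{1}{2}(q_B - q_A)\sin\theta$ makes the integrand regular, and the midpoint rule never evaluates it at the end points. The function name `period`, the test potentials and all parameter values are assumptions for illustration. For the pendulum, $q = \varphi$ and the role of the mass is played by $ml^2$, here set to 1.

```python
import math

def period(U, E, q_a, q_b, m=1.0, n=2000):
    """T(E) = sqrt(2m) * integral from q_a to q_b of dq / sqrt(E - U(q)), cf. (1.55).

    q_a and q_b are the turning points, where E = U. The substitution
    q = mid + half*sin(theta) removes the inverse-square-root singularities.
    """
    mid, half = 0.5 * (q_a + q_b), 0.5 * (q_b - q_a)
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -0.5 * math.pi + (i + 0.5) * h      # midpoint of the i-th cell
        q = mid + half * math.sin(theta)
        total += half * math.cos(theta) / math.sqrt(E - U(q))
    return math.sqrt(2.0 * m) * total * h

# Harmonic oscillator U = k q^2 / 2: the period is 2*pi*sqrt(m/k) for every E.
m, k, E = 2.0, 8.0, 3.0
amp = math.sqrt(2.0 * E / k)
T_harm = period(lambda q: 0.5 * k * q * q, E, -amp, amp, m=m)

# Pendulum with m l^2 = 1 and U = g (1 - cos phi), small amplitude phi0 = 0.1.
g, phi0 = 9.81, 0.1
U_pend = lambda phi: g * (1.0 - math.cos(phi))
T_pend = period(U_pend, U_pend(phi0), -phi0, phi0, m=1.0)
T_series = 2.0 * math.pi / math.sqrt(g) * (1.0 + phi0**2 / 16.0 + 11.0 * phi0**4 / 3072.0)
print(T_harm, T_pend, T_series)
```

For the harmonic oscillator the transformed integrand is constant, so the quadrature is exact up to rounding; this is a convenient sanity check before applying the routine to other potentials.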



1.23 Example: The Planar Pendulum for Arbitrary Deviations from the Vertical

Figure 1.12 shows the maximal deviation $\varphi_0 \le \pi$. According to Sect. 1.17.2 the
potential energy is $U(\varphi) = mgl(1 - \cos\varphi)$. For $\varphi = \varphi_0$ the kinetic energy vanishes,
so that the total energy is given by

$$E = mgl(1 - \cos\varphi_0) = mgl(1 - \cos\varphi) + \frac{1}{2} m l^2 \dot{\varphi}^2 .$$

Fig. 1.12. Plane mathematical pendulum for an arbitrary deviation $\varphi_0 \in [0, \pi]$



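The period is obtained from (1.55), replacing the arc $s = l\varphi$ by $\varphi$:

$$T = 2\sqrt{2m} \int_0^{\varphi_0} \frac{l\, d\varphi}{\sqrt{mgl(\cos\varphi - \cos\varphi_0)}} . \tag{1.56}$$

With $\cos\varphi = 1 - 2\sin^2(\varphi/2)$ this becomes

$$T = 2\sqrt{\frac{l}{g}} \int_0^{\varphi_0} \frac{d\varphi}{\sqrt{\sin^2(\varphi_0/2) - \sin^2(\varphi/2)}} . \tag{1.56'}$$

Substituting the variable $\varphi$ as follows:

$$\sin\alpha \overset{\text{def}}{=} \frac{\sin(\varphi/2)}{\sin(\varphi_0/2)} ,$$

one obtains

$$d\varphi = 2\, d\alpha\, \sin(\varphi_0/2)\, \frac{\sqrt{1 - \sin^2\alpha}}{\sqrt{1 - \sin^2(\varphi_0/2)\sin^2\alpha}} , \qquad \varphi = 0 \to \alpha = 0 , \quad \varphi = \varphi_0 \to \alpha = \pi/2 ,$$

and therefore

$$T = 4\sqrt{\frac{l}{g}}\; K\big(\sin(\varphi_0/2)\big) , \tag{1.57}$$

where $K(z) = \int_0^{\pi/2} d\alpha / \sqrt{1 - z^2 \sin^2\alpha}$ denotes the complete elliptic integral of
the first kind (see e.g. Abramowitz, Stegun 1965).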
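Formula (1.57) is easy to evaluate numerically. The sketch below uses the arithmetic-geometric mean, via the standard identity $K(z) = \pi / \big(2\,\mathrm{AGM}(1, \sqrt{1 - z^2})\big)$; this technique is not part of the text and is used here only as a convenient evaluation method. Note that $z$ is the modulus, as in the definition above, not the parameter $z^2$ used by some libraries. The names `ellip_k` and `pendulum_period` and the values $l = 1$ m, $g = 9.81$ m/s² are assumptions.

```python
import math

def ellip_k(z):
    """Complete elliptic integral of the first kind K(z), modulus 0 <= z < 1."""
    a, b = 1.0, math.sqrt(1.0 - z * z)
    for _ in range(40):                 # the AGM iteration converges quadratically
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def pendulum_period(phi0, l=1.0, g=9.81):
    """Exact period (1.57) for maximal deviation phi0 (radians, phi0 < pi)."""
    return 4.0 * math.sqrt(l / g) * ellip_k(math.sin(0.5 * phi0))

T0 = 2.0 * math.pi * math.sqrt(1.0 / 9.81)           # harmonic approximation
ratio_90 = pendulum_period(0.5 * math.pi) / T0       # deviation of 90 degrees
print(ellip_k(0.0), ratio_90)
```

For $\varphi_0 \to 0$ one recovers $K(0) = \pi/2$ and hence the harmonic period $2\pi\sqrt{l/g}$; at $\varphi_0 = 90°$ the period is about 18% longer.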

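For small and medium-sized deviations from the vertical, one can expand in
terms of $z = \sin(\varphi_0/2)$ or directly in terms of $\varphi_0/2$:

$$\big(1 - z^2 \sin^2\alpha\big)^{-1/2} \simeq 1 + \frac{z^2}{2}\sin^2\alpha + \frac{3 z^4}{8}\sin^4\alpha \simeq 1 + \frac{1}{2}\sin^2\alpha\left(\frac{1}{4}\varphi_0^2 - \frac{1}{48}\varphi_0^4\right) + \frac{3}{8}\sin^4\alpha\, \frac{1}{16}\varphi_0^4 .$$

The integrals that this expansion leads to are elementary, viz.

$$\int_0^{\pi/2} \sin^{2n} x \, dx = \frac{\sqrt{\pi}}{2\, n!}\, \Gamma\Big(n + \frac{1}{2}\Big) \qquad (n = 1, 2, \ldots) .$$

Thus, one obtains

$$K(z) \simeq \frac{\pi}{2}\left[1 + \frac{1}{4} z^2 + \frac{9}{64} z^4\right] \simeq \frac{\pi}{2}\left[1 + \frac{1}{16}\varphi_0^2 + \frac{11}{3072}\varphi_0^4\right] ,$$

and, finally,

$$T \simeq 2\pi\sqrt{\frac{l}{g}}\left[1 + \frac{1}{16}\varphi_0^2 + \frac{11}{3072}\varphi_0^4\right] . \tag{1.58}$$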



The quality of this expansion can be judged from a numerical comparison of suc- 
cessive terms as shown in Table 1.1. 
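The two correction terms of (1.58) that are compared in Table 1.1 can be recomputed in a few lines. This is a minimal sketch; the function name `correction_terms` is hypothetical. For $\varphi_0 = 20°$ the quartic term evaluates to roughly $5 \times 10^{-5}$, so the tabulated entry for that angle is best read as an order-of-magnitude figure.

```python
import math

def correction_terms(phi0_deg):
    """Relative corrections phi0^2/16 and 11*phi0^4/3072 of (1.58), phi0 in degrees."""
    phi0 = math.radians(phi0_deg)
    return phi0 ** 2 / 16.0, 11.0 * phi0 ** 4 / 3072.0

for deg in (10, 20, 45):
    second, fourth = correction_terms(deg)
    print(f"{deg:3d} deg   {second:.4f}   {fourth:.1e}")
```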

Table 1.1. Deviation from the harmonic approximation

  $\varphi_0$    $\frac{1}{16}\varphi_0^2$    $\frac{11}{3072}\varphi_0^4$
  10°      0.002      $3 \times 10^{-6}$
  20°      0.0076     $1 \times 10^{-4}$
  45°      0.039      $1.4 \times 10^{-3}$

The behavior of $T$ (1.57) in the neighborhood of $\varphi_0 = \pi$ can be studied separately.
For that purpose one calculates the time $t_\Delta$ that the pendulum takes to
swing from $\varphi = \pi - \Delta$ to $\varphi = \varphi_0 = \pi - \varepsilon$, where $\varepsilon \ll \Delta$, cf. Fig. 1.13. Introduce
$x = \pi - \varphi$ as a new variable and let $T^{(0)} = 2\pi\sqrt{l/g}$. Then

$$\frac{t_\Delta}{T^{(0)}} = \frac{1}{\pi\sqrt{2}} \int_\varepsilon^\Delta \frac{dx}{\sqrt{\cos\varepsilon - \cos x}} \simeq \frac{1}{\pi} \int_\varepsilon^\Delta \frac{dx}{\sqrt{x^2 - \varepsilon^2}} = \frac{1}{\pi} \ln 2\frac{\Delta}{\varepsilon} , \tag{1.59}$$

where we have approximated $\cos x$ by $1 - x^2/2$. For $\varphi_0 \to \pi$, i.e. for $\varepsilon \to 0$,
$t_\Delta$ tends to infinity logarithmically. The pendulum reaches the upper (unstable)
equilibrium only after infinite time.

Fig. 1.13. The plane pendulum for large deviations, say $\varphi_0 = \pi - \varepsilon$, where
$\varepsilon$ is small compared to 1. In the text we calculate the time $t_\Delta$ the pendulum
needs to swing from $\varphi = \pi - \Delta$ to the maximal value $\varphi_0$. One finds that
$t_\Delta$ goes to infinity like $-\ln\varepsilon$, as one lets $\varepsilon$ tend to zero


It is interesting to note that the limiting case $E = 2mgl$ (unstable equilibrium or
separatrix) can again be integrated by elementary means. Returning to the notation
of Sect. 1.17.2, the variable $z_1 = \varphi$ now obeys the differential equation

$$\frac{1}{2}\left(\frac{dz_1}{d\tau}\right)^2 + (1 - \cos z_1) = 2 , \quad\text{or}\quad \frac{dz_1}{d\tau} = \sqrt{2(1 + \cos z_1)} .$$

Setting $u \overset{\text{def}}{=} \tan(z_1/2)$, we find the following differential equation for $u$:

$$\frac{du}{\sqrt{u^2 + 1}} = d\tau ,$$

which can be integrated directly. For example, the solution that starts at $z_1 = 0$ at
time $\tau = 0$ fulfills

$$\int_0^u \frac{du'}{\sqrt{u'^2 + 1}} = \int_0^\tau d\tau' , \quad\text{and hence}\quad \ln\big(u + \sqrt{u^2 + 1}\big) = \tau .$$

With $u = (e^\tau - e^{-\tau})/2$, the solution for $z_1$ is obtained as follows:

$$z_1(\tau) = 2\arctan(\sinh\tau) .$$

If we again choose $z_1 = \pi - \varepsilon$, i.e. $u = \cot(\varepsilon/2) \simeq 2/\varepsilon$, we have $u + \sqrt{u^2 + 1} \simeq 4/\varepsilon$
and $\tau(\varepsilon) \simeq \ln(4/\varepsilon)$. The time to swing from $z_1 = 0$ to $z_1 = \pi$ diverges
logarithmically.
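The separatrix solution can be verified directly. The sketch below, with hypothetical function names, checks by a central finite difference that $z_1(\tau) = 2\arctan(\sinh\tau)$ satisfies $dz_1/d\tau = \sqrt{2(1 + \cos z_1)}$, and compares the exact arrival time at $z_1 = \pi - \varepsilon$, namely $\tau = \operatorname{arsinh}\big(\cot(\varepsilon/2)\big)$, with the estimate $\ln(4/\varepsilon)$. The sample values $\tau = 0.7$ and $\varepsilon = 10^{-3}$ are arbitrary.

```python
import math

def z1(tau):
    """Separatrix solution z_1(tau) = 2 arctan(sinh tau), with z_1(0) = 0."""
    return 2.0 * math.atan(math.sinh(tau))

def time_to_reach(eps):
    """Dimensionless time at which z_1 = pi - eps, i.e. arsinh(cot(eps/2))."""
    return math.asinh(1.0 / math.tan(0.5 * eps))

# Check the differential equation at a sample point by a central difference.
tau, h = 0.7, 1.0e-5
lhs = (z1(tau + h) - z1(tau - h)) / (2.0 * h)
rhs = math.sqrt(2.0 * (1.0 + math.cos(z1(tau))))

# Logarithmic divergence of the time needed to come within eps of the top.
eps = 1.0e-3
tau_exact = time_to_reach(eps)
tau_estimate = math.log(4.0 / eps)
print(lhs - rhs, tau_exact, tau_estimate)
```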



1.24 Example: The Two-Body System with a Central Force 



Another important example is the two-body system (over $\mathbb{R}^3$) with a central force,
to which we now turn. It can be analyzed in close analogy to the one-dimensional
problem of Sect. 1.22.

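The general analysis of the two-body system was given in Sect. 1.7. Since the
force is supposed to be a central force (assumed to be continuous), it can be derived
from a spherically symmetric potential $U(r)$. The equation of motion becomes

$$\mu \ddot{\mathbf{r}} = -\nabla U(r) , \quad\text{with}\quad \mu = \frac{m_1 m_2}{m_1 + m_2} , \tag{1.60}$$

where $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ is again the relative coordinate and $r = |\mathbf{r}|$. If the central force reads
$\mathbf{F} = F(r)\,\hat{\mathbf{r}}$, the corresponding potential is $U(r) = -\int_{r_0}^{r} F(r')\, dr'$. The motion
takes place in the plane perpendicular to the conserved relative orbital angular
momentum $\mathbf{l}_{\mathrm{rel}} = \mathbf{r} \times \mathbf{p}$. Introducing polar coordinates in that plane, $x = r\cos\varphi$
and $y = r\sin\varphi$, one has $\dot{\mathbf{r}}^2 = \dot{r}^2 + r^2\dot{\varphi}^2$.

The energy of relative motion is conserved because no forces apply to the center
of mass and therefore total momentum is conserved:

$$T_S + E = \frac{\mathbf{P}^2}{2M} + \frac{1}{2}\mu\big(\dot{r}^2 + r^2\dot{\varphi}^2\big) + U(r) = \text{const} . \tag{1.61}$$

Thus, with $l \equiv |\mathbf{l}| = \mu r^2 \dot{\varphi}$,

$$E = \frac{1}{2}\mu\dot{r}^2 + \frac{l^2}{2\mu r^2} + U(r) = \text{const} . \tag{1.62}$$

$T_r \overset{\text{def}}{=} \mu\dot{r}^2/2$ is the kinetic energy of radial motion, whereas the term $l^2/2\mu r^2 = \mu r^2\dot{\varphi}^2/2$
can be read as the kinetic energy of the rotatory motion, or as the potential
energy pertaining to the centrifugal force,

$$\mathbf{Z} = -\nabla\left(\frac{l^2}{2\mu r^2}\right) = \frac{l^2}{\mu r^3}\,\hat{\mathbf{r}} = \mu r \dot{\varphi}^2\, \hat{\mathbf{r}} = \frac{\mu}{r}\, v_r^2\, \hat{\mathbf{r}} ,$$

where $v_r = r\dot{\varphi}$ denotes the velocity of the rotatory motion.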



From angular-momentum conservation

$$l = \mu r^2 \dot{\varphi} = \text{const} , \tag{1.63}$$

and from energy conservation (1.62) one obtains differential equations for $r(t)$ and
$\varphi(t)$:

$$\frac{dr}{dt} = \sqrt{\frac{2}{\mu}\big(E - U_{\mathrm{eff}}(r)\big)} , \tag{1.64}$$

$$\frac{d\varphi}{dt} = \frac{l}{\mu r^2} , \quad\text{with} \tag{1.65}$$

$$U_{\mathrm{eff}}(r) = U(r) + \frac{l^2}{2\mu r^2} , \tag{1.66}$$

where the latter, $U_{\mathrm{eff}}(r)$, can be interpreted as an effective potential. When written
in this form the analogy to the truly one-dimensional case of (1.51) is clearly
visible. Like (1.51) the equation of motion (1.64) can be solved by separation of
variables, yielding $r$ as a function of time $t$. This must then be inserted into (1.65),
whose integration yields the function $\varphi(t)$. Another way of solving the system of
equations (1.64) and (1.65) is to eliminate the explicit time dependence by dividing
the second by the first and by solving the resulting differential equation for $r$ as
a function of $\varphi$, viz.

$$\frac{d\varphi}{dr} = \frac{l}{r^2\sqrt{2\mu\big(E - U_{\mathrm{eff}}(r)\big)}} . \tag{1.67}$$

This equation is again separable, and one has

$$\varphi - \varphi_0 = l \int_{r_0}^{r} \frac{dr'}{r'^2\sqrt{2\mu\big(E - U_{\mathrm{eff}}(r')\big)}} . \tag{1.68}$$



Writing $E = T_r + U_{\mathrm{eff}}(r)$, the positivity of $T_r$ again implies that $E \ge U_{\mathrm{eff}}(r)$. Thus,
if $r(t)$ reaches a point $r_1$, where $E = U_{\mathrm{eff}}(r_1)$, the radial velocity $\dot{r}(r_1)$ vanishes.
Unlike the case of one-dimensional motion this does not mean (for $l \neq 0$) that the
particle really comes to rest and then returns. It rather means that it has reached a
point of closest approach to, or of greatest distance from, the force center. The first
is called perihelion or, more generally, pericenter, the second is called aphelion
or apocenter. It is true that the particle has no radial velocity at $r_1$ but, as long as
$l \neq 0$, it still has a nonvanishing angular velocity.

There are various cases to be distinguished. 

(i) $r(t) \ge r_{\min} \equiv r_P$ ("P" for "perihelion"). Here the motion is not finite;
the particle comes from infinity, passes through perihelion, and disappears again
towards infinity. For an attractive potential the orbit may look like the examples
sketched in Fig. 1.14. For a repulsive potential it will have the shape shown in
Fig. 1.15. In the former case the particle revolves about the force center once or
several times; in the latter it is repelled by the force center and will therefore be
scattered.

Fig. 1.14. Various infinite orbits for an attractive potential
energy. P is the point of closest approach (pericenter or
perihelion)

Fig. 1.15. Typical infinite orbit for a repulsive central potential



(ii) $r_{\min} \equiv r_P \le r(t) \le r_{\max} \equiv r_A$ ("A" for "aphelion"). In this case the
entire orbit is confined to the circular annulus between the circles with radii $r_P$
and $r_A$. In order to construct the whole orbit it is sufficient to know that portion of
the orbit which is comprised between an aphelion and the perihelion immediately
succeeding it (see the sketch in Fig. 1.16). Indeed, it is not difficult to realize that
the orbit is symmetric with respect to both the line SA and the line SP of Fig. 1.16.
To see this, consider two polar angles $\Delta\varphi$ and $-\Delta\varphi$, with $\Delta\varphi = \varphi - \varphi_A$, that
define directions symmetric with respect to SA, see Fig. 1.17, with

$$\Delta\varphi = l \int_{r_A}^{r(\varphi)} \frac{dr}{r^2\sqrt{2\mu\big(E - U_{\mathrm{eff}}(r)\big)}} .$$

One has

$$U_{\mathrm{eff}}(r) = U_{\mathrm{eff}}(r_A) + \big(U_{\mathrm{eff}}(r) - U_{\mathrm{eff}}(r_A)\big) = E + \big(U_{\mathrm{eff}}(r) - U_{\mathrm{eff}}(r_A)\big) ,$$

so that

$$\Delta\varphi = l \int_{r_A}^{r(\varphi)} \frac{dr}{r^2\sqrt{2\mu\big(U_{\mathrm{eff}}(r_A) - U_{\mathrm{eff}}(r)\big)}} . \tag{1.69}$$

Fig. 1.16. Bound, or finite, orbit for an attractive central
potential. The orbit has two symmetry axes: the line SA
from the force center S to the apocenter, and the line
SP from S to the pericenter. Thus, the entire rosette orbit
can be constructed from the branch PA of the orbit.
(The curve shown here is the example discussed below, with
$\alpha = 1.3$, $b = 1.5$.)

Fig. 1.17. Two symmetric positions before and after passage through
the apocenter

Instead of moving from A to $C_1$, by choosing the other sign of the square root in
(1.69), the system may equally well move from A to $C_2$. From (1.67) this means
that one changes the direction of motion, or, according to (1.64) and (1.65), that
the direction of time is reversed. As $r(\varphi)$ is the same for $+\Delta\varphi$ and $-\Delta\varphi$, we
conclude that if $C_1 = [r(\varphi), \varphi = \varphi_A + \Delta\varphi]$ is a point on the orbit, so is
$C_2 = [r(\varphi), \varphi = \varphi_A - \Delta\varphi]$. A similar reasoning holds for P. This proves the symmetry
stated above.

We illustrate these results by means of the following example.

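Example. A central potential of the type $U(r) = -a/r^\alpha$. Let $(r, \varphi)$ be the polar
coordinates in the plane of the orbit. Then

$$\frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\Big(E + \frac{a}{r^\alpha}\Big) - \frac{l^2}{\mu^2 r^2}} , \tag{1.70}$$

$$\frac{d\varphi}{dt} = \frac{l}{\mu r^2} . \tag{1.71}$$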







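Since we consider only finite orbits for which $E$ is negative, we set $B \overset{\text{def}}{=} -E$.
We introduce dimensionless variables by the following definitions:

$$\varrho(\tau) \overset{\text{def}}{=} \frac{\sqrt{\mu B}}{l}\, r(t) , \qquad \tau \overset{\text{def}}{=} \frac{B}{l}\, t .$$

The equations of motion (1.70) and (1.71) then read

$$\frac{d\varrho}{d\tau} = \pm\sqrt{\frac{2b}{\varrho^\alpha} - \frac{1}{\varrho^2} - 2} , \tag{1.70'}$$

$$\frac{d\varphi}{d\tau} = \frac{1}{\varrho^2} , \tag{1.71'}$$

where we have set

$$b \overset{\text{def}}{=} \frac{a}{B}\left(\frac{\sqrt{\mu B}}{l}\right)^{\alpha} .$$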



The value $\alpha = 1$ defines the Kepler problem, in which case the solutions of (1.70')
and (1.71') read

$$\varrho(\varphi) = \frac{1}{b\,\big(1 + \varepsilon\cos(\varphi - \varphi_0)\big)} \quad\text{with}\quad \varepsilon = \sqrt{1 - 2/b^2} .$$

The constant $\varphi_0$ can be chosen at will, e.g. $\varphi_0 = 0$. Figures 1.18-1.22 show
the orbit $\varrho(\varphi)$ for various values of the parameters $\alpha$ and $b$. Figure 1.18 shows two
Kepler ellipses with $b = 1.5$ and $b = 3$. Figures 1.19, 1.20 illustrate the situation
for $\alpha > 1$ where the orbit "advances" compared with the Kepler ellipse. Similarly,
Figs. 1.21, 1.22, valid for $\alpha < 1$, show it "staying behind" with respect to the
Kepler case. In either case, after one turn, the perihelion is shifted compared with
the Kepler case ($\alpha = 1$) either forward ($\alpha > 1$) or backward ($\alpha < 1$). In the
former case there is more attraction at perihelion compared to the Kepler ellipse,
in the latter, less, thus causing the rosette-shaped orbit to advance or to stay behind,
respectively.

Fig. 1.21. Example of a rosette orbit that "stays behind", with $\alpha = 0.9$, $b = 2$

Fig. 1.22. A rosette orbit as in Fig. 1.21 but with $\alpha = 0.8$, $b = 3$

Remark: From the above exercise it seems plausible that finite orbits which
close after a finite number of revolutions about the origin are the exception rather
than the rule. For this to happen the angle $\varphi$ between the straight lines SA and
SP of Fig. 1.16 must be a rational number times $2\pi$, $\varphi = (n/p)(2\pi)$, $n, p \in \mathbb{N}$,
where $p$ is the number of branches PA of the orbit needed to close it, and $n$ is
the number of turns about S. For example, in the case of the bound Kepler orbits
we have $p = 2$ and $n = 1$. This is a very special case insofar as the points A, S
and P lie on a straight line, with A and P separated by S.
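The angle between SP and SA can be computed numerically for the potential $U = -a/r^\alpha$. The sketch below is one possible approach under stated assumptions, not the method of the text: with $u = 1/\varrho$, equations (1.70') and (1.71') give $(du/d\varphi)^2 = 2b\,u^\alpha - u^2 - 2$, and differentiating once yields the regular equation $u'' = \alpha b\, u^{\alpha - 1} - u$, which is integrated by a Runge-Kutta scheme from pericenter to the next apocenter. The function names, the step size and the restriction $0 < \alpha < 2$ are assumptions. For $\alpha = 1$ the result should be $\pi$, i.e. $n/p = 1/2$.

```python
import math

def turning_points(alpha, b):
    """Roots u_min < u_max of f(u) = 2 b u^alpha - u^2 - 2, u = 1/rho, 0 < alpha < 2."""
    f = lambda u: 2.0 * b * u ** alpha - u * u - 2.0
    u_c = (alpha * b) ** (1.0 / (2.0 - alpha))      # position of the maximum of f
    if f(u_c) <= 0.0:
        raise ValueError("no bound orbit for these parameters")

    def bisect(lo, hi):
        # f(lo) and f(hi) have opposite signs; halve the bracket repeatedly.
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if (f(lo) > 0.0) == (f(mid) > 0.0):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    hi = 2.0 * u_c
    while f(hi) > 0.0:
        hi *= 2.0
    return bisect(0.0, u_c), bisect(u_c, hi)

def apsidal_angle(alpha, b, h=1.0e-3, max_steps=1000000):
    """Polar angle swept between the pericenter and the next apocenter."""
    _, u_max = turning_points(alpha, b)
    acc = lambda u: alpha * b * u ** (alpha - 1.0) - u
    u, w, phi = u_max, 0.0, 0.0          # w = du/dphi vanishes at the pericenter
    for _ in range(max_steps):
        # One classical Runge-Kutta step for u' = w, w' = acc(u).
        k1u, k1w = w, acc(u)
        k2u, k2w = w + 0.5 * h * k1w, acc(u + 0.5 * h * k1u)
        k3u, k3w = w + 0.5 * h * k2w, acc(u + 0.5 * h * k2u)
        k4u, k4w = w + h * k3w, acc(u + h * k3u)
        u_new = u + h / 6.0 * (k1u + 2.0 * k2u + 2.0 * k3u + k4u)
        w_new = w + h / 6.0 * (k1w + 2.0 * k2w + 2.0 * k3w + k4w)
        if w < 0.0 <= w_new:
            # Apocenter passed: interpolate linearly to the zero of w.
            return phi + h * w / (w - w_new)
        u, w, phi = u_new, w_new, phi + h
    raise RuntimeError("apocenter not reached")

kepler = apsidal_angle(1.0, 1.5)         # Kepler case, expected to equal pi
advancing = apsidal_angle(1.3, 1.5)      # parameters quoted for Fig. 1.16
retarded = apsidal_angle(0.9, 2.0)       # parameters quoted for Fig. 1.21
print(kepler / math.pi, advancing / math.pi, retarded / math.pi)
```

An angle larger than $\pi$ corresponds to a perihelion that advances, an angle smaller than $\pi$ to one that stays behind; the orbit closes only if the printed ratio, divided by 2, is rational.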

The following theorem answers the even more restrictive question as to whether
all finite orbits in a given central potential close: Bertrand's theorem⁶: The central
potentials $U(r) = \alpha/r$ with $\alpha < 0$ and $U(r) = b r^2$ with $b > 0$ are the only ones
in which all finite (i.e. bound) orbits close.

As we know, in the first case it is the orbits with negative energy which close;
they are the well-known ellipses or circles of the Kepler problem. In the second
case all orbits are closed and elliptical.

⁶ J. Bertrand (1873): R. Acad. Sci. 77, p. 849. The proof of the theorem is not too difficult. For
example, Arnol'd proposes a sequence of five problems from which one deduces the assertion
(Arnol'd, 1989).

Remarks: The examples studied in this section emphasize the special nature
of the Kepler problem whose bound orbits close after one turn around the center of
force. The rosette-like orbit represents the generic case while the ellipse (or circle)
is the exception. This property of the attractive $1/r$-potential can also be seen if
instead of the plane of motion in $\mathbb{R}^3$ we study the motion in terms of its momentum
$\mathbf{p} = (p_x, p_y)^T$. The solution (1.21) for arbitrary orientation of the perihelion

$$r(t) = \frac{p}{1 + \varepsilon\cos\big(\phi(t) - \phi_0\big)} ,$$

when decomposed in terms of Cartesian coordinates $(x, y)$ in the plane of motion,
reads

$$x(t) = \frac{p}{1 + \varepsilon\cos\big(\phi(t) - \phi_0\big)}\cos\big(\phi(t) - \phi_0\big) , \qquad y(t) = \frac{p}{1 + \varepsilon\cos\big(\phi(t) - \phi_0\big)}\sin\big(\phi(t) - \phi_0\big) .$$

The derivatives of $x(t)$ and of $y(t)$ with respect to time are

$$\dot{x}(t) = -p\,\frac{\sin(\phi - \phi_0)}{\big[1 + \varepsilon\cos(\phi(t) - \phi_0)\big]^2}\,\dot{\phi} = -\frac{1}{p}\,(r^2\dot{\phi})\sin(\phi - \phi_0) ,$$

$$\dot{y}(t) = p\,\frac{\cos(\phi - \phi_0) + \varepsilon}{\big[1 + \varepsilon\cos(\phi(t) - \phi_0)\big]^2}\,\dot{\phi} = \frac{1}{p}\,(r^2\dot{\phi})\big[\cos(\phi - \phi_0) + \varepsilon\big] .$$

Upon multiplication with the reduced mass $\mu$, making use of the conservation
law (1.19) for $\ell$, the modulus of the angular momentum, $\ell = \mu r^2\dot{\phi}$, and inserting
the definition $p = \ell^2/(A\mu)$, one obtains

$$p_x = \mu\dot{x} = -\frac{A\mu}{\ell}\sin(\phi - \phi_0) , \qquad p_y = \mu\dot{y} = \frac{A\mu}{\ell}\big\{\cos(\phi - \phi_0) + \varepsilon\big\} .$$

In a two-dimensional space spanned by $p_x$ and $p_y$ this solution is a circle about
the point

$$\Big(0,\; \varepsilon\,\frac{A\mu}{\ell}\Big) = \Big(0,\; \sqrt{(A\mu/\ell)^2 + 2\mu E}\Big) ,$$

where we have inserted the definition of the eccentricity, $\varepsilon = \sqrt{1 + 2E\ell^2/(\mu A^2)}$.
The radius of this circle is $R = A\mu/\ell$. The bound orbits in the space spanned
by $p_x$ and $p_y$ are called hodographs. In the case of the Kepler problem they are
always circles.
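The circular hodograph can be observed in a direct numerical integration. The following sketch integrates $\mu\ddot{\mathbf{r}} = -A\,\mathbf{r}/r^3$ with a Runge-Kutta scheme and records how far the momentum strays from the circle of radius $R = A\mu/\ell$ about $(0, \varepsilon R)$. The values $\mu = A = 1$, the start at the pericenter $(1, 0)$ with velocity $(0, 1.2)$, the step size and the function name are illustrative assumptions; the start on the positive $x$-axis corresponds to $\phi_0 = 0$, which is what places the center on the $p_y$-axis.

```python
import math

def hodograph_check(mu=1.0, A=1.0, v0=1.2, dt=1.0e-3, steps=16000):
    """Return (R, eps, worst): hodograph radius, eccentricity, and the largest
    deviation of |p - center| from R along the numerically integrated orbit."""
    def deriv(s):
        x, y, vx, vy = s
        r3 = (x * x + y * y) ** 1.5
        return (vx, vy, -A * x / (mu * r3), -A * y / (mu * r3))

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    s = (1.0, 0.0, 0.0, v0)                          # pericenter on the x-axis
    ell = mu * (s[0] * s[3] - s[1] * s[2])           # angular momentum
    E = 0.5 * mu * v0 ** 2 - A / 1.0                 # energy, with r = 1 initially
    eps = math.sqrt(1.0 + 2.0 * E * ell ** 2 / (mu * A ** 2))
    R = A * mu / ell
    center = (0.0, eps * R)
    worst = 0.0
    for _ in range(steps):                           # slightly more than one period
        k1 = deriv(s)
        k2 = deriv(shifted(s, k1, 0.5 * dt))
        k3 = deriv(shifted(s, k2, 0.5 * dt))
        k4 = deriv(shifted(s, k3, dt))
        s = tuple(si + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
        px, py = mu * s[2], mu * s[3]
        dist = math.hypot(px - center[0], py - center[1])
        worst = max(worst, abs(dist - R))
    return R, eps, worst

R, eps, worst = hodograph_check()
print(R, eps, worst)
```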

This remarkable result is related to another constant of the motion, the
Hermann-Bernoulli-Laplace-Lenz vector, that applies to the $1/r$ potential. We will
show this in the framework of canonical mechanics in Exercise 2.31.



1.25 Rotating Reference Systems: Coriolis and Centrifugal Forces

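Let $\mathbf{K}$ be an inertial system and $\mathbf{K}'$ another system that coincides with $\mathbf{K}$ at time
$t = 0$ and rotates with angular velocity $\omega = |\boldsymbol{\omega}|$ about the direction $\hat{\boldsymbol{\omega}} = \boldsymbol{\omega}/\omega$, as
shown in Fig. 1.23. Clearly, $\mathbf{K}'$ is not an inertial system. The position vector of a
mass point is $\mathbf{r}(t)$ with respect to $\mathbf{K}$ and $\mathbf{r}'(t)$ with respect to $\mathbf{K}'$, with $\mathbf{r}(t) = \mathbf{r}'(t)$.
The velocities are related by

$$\mathbf{v}' = \mathbf{v} - \boldsymbol{\omega} \times \mathbf{r}' ,$$

where $\mathbf{v}'$ refers to $\mathbf{K}'$ and $\mathbf{v}$ to $\mathbf{K}$. Denoting the change per unit time as it is observed
from $\mathbf{K}'$ by $d'/dt$, this means that

$$\frac{d'}{dt}\mathbf{r} = \frac{d}{dt}\mathbf{r} - \boldsymbol{\omega} \times \mathbf{r} \quad\text{or}\quad \frac{d}{dt}\mathbf{r} = \frac{d'}{dt}\mathbf{r} + \boldsymbol{\omega} \times \mathbf{r} ,$$

where

  $d/dt$: time derivative as observed from $\mathbf{K}$,
  $d'/dt$: time derivative as observed from $\mathbf{K}'$.

Fig. 1.23. The coordinate system $\mathbf{K}'$ rotates about the system $\mathbf{K}$
with angular velocity $\boldsymbol{\omega}$

The relation between $d\mathbf{r}/dt$ and $d'\mathbf{r}/dt$ must be valid for any vector-valued function
$\mathbf{a}(t)$, viz.

$$\frac{d}{dt}\mathbf{a} = \frac{d'}{dt}\mathbf{a} + \boldsymbol{\omega} \times \mathbf{a} . \tag{1.72}$$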



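Taking $\boldsymbol{\omega}$ to be constant in time and applying the relationship (1.72) to
the velocity $\mathbf{a}(t) = d\mathbf{r}/dt$, we find

$$\frac{d^2}{dt^2}\mathbf{r} = \frac{d'}{dt}\Big(\frac{d\mathbf{r}}{dt}\Big) + \boldsymbol{\omega} \times \frac{d\mathbf{r}}{dt} = \frac{d'}{dt}\Big(\frac{d'}{dt}\mathbf{r} + \boldsymbol{\omega} \times \mathbf{r}\Big) + \boldsymbol{\omega} \times \Big(\frac{d'}{dt}\mathbf{r} + \boldsymbol{\omega} \times \mathbf{r}\Big) = \frac{d'^2}{dt^2}\mathbf{r} + 2\boldsymbol{\omega} \times \frac{d'}{dt}\mathbf{r} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) . \tag{1.73}$$

(If $\boldsymbol{\omega}$ does depend on time, this equation contains one more term, $(d'\boldsymbol{\omega}/dt) \times \mathbf{r} = (d\boldsymbol{\omega}/dt) \times \mathbf{r} \equiv \dot{\boldsymbol{\omega}} \times \mathbf{r}$.)

Newton's equations are valid in $\mathbf{K}$ because $\mathbf{K}$ is inertial; thus

$$m\frac{d^2}{dt^2}\mathbf{r}(t) = \mathbf{F} .$$

Inserting the relation (1.73) between the acceleration $d^2\mathbf{r}/dt^2$, as seen from $\mathbf{K}$,
and the acceleration $d'^2\mathbf{r}/dt^2$, as seen from $\mathbf{K}'$, in the equation of motion, one
obtains

$$m\frac{d'^2}{dt^2}\mathbf{r} = \mathbf{F} - 2m\boldsymbol{\omega} \times \frac{d'}{dt}\mathbf{r} - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) . \tag{1.74}$$

When observed from $\mathbf{K}'$, which is not inertial, the mass point is subject not only
to the original force $\mathbf{F}$ but also to the

$$\text{Coriolis force} \quad \mathbf{C} = -2m\boldsymbol{\omega} \times \mathbf{v}' \tag{1.75}$$

and the

$$\text{centrifugal force} \quad \mathbf{Z} = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) , \tag{1.76}$$

whose directions are easily determined from these formulae.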



1.26 Examples of Rotating Reference Systems 

Example (i) Any system tied to a point on the earth may serve as an example of a
rotating reference frame. Referring to the notation of Fig. 1.24, the plane tangent to
the earth at A rotates horizontally about the component $\boldsymbol{\omega}_v$ of $\boldsymbol{\omega}$. In addition, as a
whole, it also rotates about the component $\boldsymbol{\omega}_h$ (the tangent of the meridian passing
through A). If a mass point moves horizontally, i.e. in the tangent plane, only the
component $\boldsymbol{\omega}_v$ will be effective in (1.75). Thus, in the northern hemisphere the
mass point will be deviated to the right.

Fig. 1.24. A coordinate system fixed at a point A on the
earth's surface rotates about the south-north axis with angular
velocity $\boldsymbol{\omega} = \boldsymbol{\omega}_v + \boldsymbol{\omega}_h$

For vertical motion, to a first approximation, only $\boldsymbol{\omega}_h$ is effective. In the northern
hemisphere this causes an eastward deviation, which can easily be estimated
for the example of free fall. For the sake of illustration, we calculate this deviation
in two different ways.

(a) With respect to an inertial system fixed in space. We assume the mass point
$m$ to have a fixed position above point A on the earth's surface. This is sketched
in Fig. 1.25, which shows the view looking down on the north pole. The particle's
tangent velocity (with reference to $\mathbf{K}$!) is $v_T(R + h) = (R + h)\,\omega\cos\varphi$. At time $t = 0$
we let it fall freely from the top of a tower of height $H$. As seen from $\mathbf{K}$, $m$ moves
horizontally (eastwards) with the constant velocity $v_T(R + H) = (R + H)\,\omega\cos\varphi$,
while falling vertically with constant acceleration $g$. Therefore, the height $H$ and
the time $T$ needed to reach the ground are related by $H = \frac{1}{2} g T^2$. If at the same
time ($t = 0$) the point A at the bottom of the tower left the earth's surface along
a tangent, it would move horizontally with a constant velocity $v_T(R) = R\,\omega\cos\varphi$.
Thus, after a time $T$, the mass point would hit the ground at a distance

$$\Delta_0 = \big(v_T(R + H) - v_T(R)\big)\,T = H\omega T\cos\varphi$$

east of A. In reality, during the time that $m$ needs to fall to the earth, the tower
has continued its accelerated motion, in an easterly direction, and therefore the real
deviation $\Delta$ is smaller than $\Delta_0$. At time $t$, with $0 \le t \le T$, the horizontal relative
velocity of the mass point and the tower is $\big(v_T(R + H) - v_T(R + H - \frac{1}{2} g t^2)\big) = \frac{1}{2} g\omega t^2\cos\varphi$.
This must be integrated from 0 to $T$ and the result must be subtracted
from $\Delta_0$. The real deviation is then

$$\Delta \simeq \Delta_0 - \frac{1}{2} g\omega\cos\varphi \int_0^T t^2\, dt = \Big(H - \frac{1}{6} g T^2\Big)\omega T\cos\varphi = \frac{1}{3} g\omega T^3\cos\varphi .$$

Fig. 1.25. A body falling down vertically is deviated towards
the east. Top view of the north pole and the parallel of latitude
of A
(b) In the accelerated system moving with the earth. We start from the equation
of motion (1.74). As the empirical constant $g$ is the sum of the gravitational acceleration,
directed towards the center of the earth, and the centrifugal acceleration,
directed away from it, the centrifugal force (1.76) is already taken into account.
(Note that the Coriolis force is linear in $\omega$ while the centrifugal force is quadratic
in $\omega$. In the range of distances and velocities relevant for terrestrial problems both
of these are small as compared to the force of attraction by the earth, the centrifugal
force being sizeably smaller than the Coriolis force.) Thus, (1.74) reduces
to

$$m\frac{d'^2\mathbf{r}}{dt^2} = -mg\,\hat{\mathbf{e}}_v - 2m\omega\Big(\hat{\boldsymbol{\omega}} \times \frac{d'\mathbf{r}}{dt}\Big) . \tag{1.74'}$$

We write the solution in the form $\mathbf{r}(t) = \mathbf{r}^{(0)}(t) + \omega\,\mathbf{u}(t)$, where $\mathbf{r}^{(0)}(t) = (H - \frac{1}{2} g t^2)\,\hat{\mathbf{e}}_v$
is the solution of (1.74') without the Coriolis force ($\omega = 0$). As
$\omega = 2\pi/(1\ \text{day}) = 7.3 \times 10^{-5}\ \text{s}^{-1}$ is very small, we determine the function $\mathbf{u}(t)$ from
(1.74') approximately by keeping only those terms independent of $\omega$ and linear in
$\omega$. Inserting the expression for $\mathbf{r}(t)$ into (1.74'), we obtain for $\mathbf{u}(t)$

$$m\omega\frac{d'^2}{dt^2}\mathbf{u} \simeq 2mgt\,\omega\,(\hat{\boldsymbol{\omega}} \times \hat{\mathbf{e}}_v) .$$

$\hat{\boldsymbol{\omega}}$ is parallel to the earth's axis, $\hat{\mathbf{e}}_v$ is vertical. Therefore, $(\hat{\boldsymbol{\omega}} \times \hat{\mathbf{e}}_v) = \cos\varphi\,\hat{\mathbf{e}}_e$, where
$\hat{\mathbf{e}}_e$ is tangent to the earth's surface and points eastwards. One obtains

$$\frac{d'^2}{dt^2}\mathbf{u} \simeq 2gt\cos\varphi\,\hat{\mathbf{e}}_e ,$$

and, by integrating twice,

$$\mathbf{u} \simeq \frac{1}{3} g t^3\cos\varphi\,\hat{\mathbf{e}}_e .$$

Thus, the eastward deviation is $\Delta \simeq \frac{1}{3} g T^3\omega\cos\varphi$, as above.
Inserting the relation between $T$ and $H$, we get

$$\Delta \simeq \frac{2\sqrt{2}}{3}\,\omega\, g^{-1/2} H^{3/2}\cos\varphi \simeq 2.189 \times 10^{-5}\, H^{3/2}\cos\varphi$$

(with $H$ and $\Delta$ in meters). For a numerical example choose $H = 160$ m, $\varphi = 50°$. This gives $\Delta \simeq 2.8$ cm.
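The numbers of this example can be reproduced as follows. The sketch assumes $g = 9.81$ m/s² and $\omega = 2\pi/(86400\ \text{s})$; these values are not stated explicitly at this point of the text, but they are consistent with the coefficient $2.189 \times 10^{-5}$ quoted above. The function name is hypothetical.

```python
import math

def eastward_deviation(H, latitude_deg, g=9.81, omega=2.0 * math.pi / 86400.0):
    """Eastward deviation (in m) after free fall from height H (in m),
    Delta = g T^3 omega cos(phi) / 3 with H = g T^2 / 2."""
    T = math.sqrt(2.0 * H / g)                       # time of fall
    return g * T ** 3 * omega * math.cos(math.radians(latitude_deg)) / 3.0

coefficient = eastward_deviation(1.0, 0.0)           # prefactor of H^(3/2) cos(phi)
delta = eastward_deviation(160.0, 50.0)              # example of the text
print(coefficient, delta)
```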

Example (ii) Let a mass $m$ be connected to a fixed point O in space and let it
rotate with constant angular velocity about that point, as shown in Fig. 1.26. Its
kinetic energy is then $T = \frac{1}{2} m R^2\omega^2$. If we now cut the connection to O, $m$ will
leave the circle (O; R) along a tangent with constant velocity $R\omega$. How does the
same motion look in a system $\mathbf{K}'$ that rotates synchronously?

Fig. 1.26. A mass point rotates uniformly about the origin O.
$\mathbf{K}'$ is a coordinate system in the plane that rotates
synchronously with the particle

From (1.74) one has

$$m\frac{d'^2}{dt^2}\mathbf{r}' = 2m\omega\frac{d'}{dt}\big(x_2'\,\hat{\mathbf{e}}_1' - x_1'\,\hat{\mathbf{e}}_2'\big) + m\omega^2\,\mathbf{r}' ,$$

or, when written in components,

$$m\frac{d'^2}{dt^2}x_1' = 2m\omega\frac{d'}{dt}x_2' + m\omega^2 x_1' , \qquad m\frac{d'^2}{dt^2}x_2' = -2m\omega\frac{d'}{dt}x_1' + m\omega^2 x_2' .$$

The initial condition at $t = 0$ reads

$$t = 0: \quad x_1' = R , \quad \frac{d'}{dt}x_1' = 0 , \qquad x_2' = 0 , \quad \frac{d'}{dt}x_2' = 0 .$$

With respect to $\mathbf{K}$ we would then have

$$x_1(t) = R , \qquad x_2(t) = R\omega t .$$

Therefore, the relationships

$$x_1' = x_1\cos\omega t + x_2\sin\omega t , \qquad x_2' = -x_1\sin\omega t + x_2\cos\omega t$$

give us at once the solution of the problem, viz.

$$x_1'(t) = R\cos\omega t + R\omega t\sin\omega t , \qquad x_2'(t) = -R\sin\omega t + R\omega t\cos\omega t .$$

It is instructive to sketch this orbit, as seen from $\mathbf{K}'$, and thereby realize that uniform
rectilinear motion looks complicated when observed from a rotating, noninertial
system.
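The closed-form orbit can be checked against a direct integration of the equations of motion in the rotating frame. In the sketch below the values $R = 1$, $\omega = 2$, the end time and the function names are illustrative assumptions; the integrator is the classical Runge-Kutta scheme.

```python
import math

def integrate_rotating(R=1.0, omega=2.0, t_end=3.0, n=3000):
    """Integrate x1'' = 2 w x2' + w^2 x1, x2'' = -2 w x1' + w^2 x2 in K',
    starting at rest (relative to K') at the point (R, 0)."""
    def deriv(s):
        x1, x2, v1, v2 = s
        return (v1, v2,
                2.0 * omega * v2 + omega ** 2 * x1,      # Coriolis + centrifugal
                -2.0 * omega * v1 + omega ** 2 * x2)

    h = t_end / n
    s = (R, 0.0, 0.0, 0.0)
    for _ in range(n):
        k1 = deriv(s)
        k2 = deriv(tuple(si + 0.5 * h * ki for si, ki in zip(s, k1)))
        k3 = deriv(tuple(si + 0.5 * h * ki for si, ki in zip(s, k2)))
        k4 = deriv(tuple(si + h * ki for si, ki in zip(s, k3)))
        s = tuple(si + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s[0], s[1]

def closed_form(t, R=1.0, omega=2.0):
    """Uniform rectilinear motion x1 = R, x2 = R w t, transformed to K'."""
    wt = omega * t
    return (R * math.cos(wt) + R * wt * math.sin(wt),
            -R * math.sin(wt) + R * wt * math.cos(wt))

numeric = integrate_rotating()
exact = closed_form(3.0)
print(numeric, exact)
```

Plotting the pairs returned by `closed_form` for increasing `t` gives the outward spiral that the text suggests sketching.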

Example (iii) A particularly nice example is provided by the Foucault pen- 
dulum that the reader might have seen in a laboratory experiment or in a sci- 
ence museum. The model is the following. In a site whose geographical latitude 
is 0 < cp < it f2 a mathematical pendulum is suspended in the point with coor- 
dinates (0, 0, l) above the ground, and is brought to swing in some vertical plane 
through that point. Imagine the pendulum to be modeled by a point mass m sus- 
tained by a massless thread whose length is /. In the rotating system K, attached to 
the earth, let the unit vectors be chosen such that e\ points southwards, ei points 
eastwards, while e 3 denotes the upward vertical direction. A careful sketch of the 
pendulum and the base vectors shows that the stress acting on the thread is given 
by 



„ „ , *\ - X2 /, l-X 3 „ 

Z = Z e\ eo H C3 

’ l I I 



where we have normalized the components such that Z is the modulus of this 
vector field, Z = |Z|. Indeed, I 2 = xj + x 2 + (7 — X 3) 2 so that the sum of the 
squares of the coefficients in the parentheses is equal to 1 . Inserting this expression 
in the equation of motion ( 1 . 75 ) and denoting, for simplicity, the time derivative 
i with respect to the rotating system K by a dot, the equation of motion reads 

mr — Z + mg — 2m (co x r) — m <0 x (&> x r) . 



For the same reasons as before we neglect the centrifugal force. With the choice
of the reference system described above one has

$$ \mathbf{g} = -g\,\hat{\mathbf{e}}_3 , \qquad \boldsymbol{\omega} = \omega \begin{pmatrix} -\cos\varphi \\ 0 \\ \sin\varphi \end{pmatrix} , \qquad \boldsymbol{\omega}\times\dot{\mathbf{r}} = \omega \begin{pmatrix} -\dot{x}_2 \sin\varphi \\ \dot{x}_1 \sin\varphi + \dot{x}_3 \cos\varphi \\ -\dot{x}_2 \cos\varphi \end{pmatrix} , $$

where $\omega$ is the modulus of the angular velocity, and $\varphi$ the geographical latitude.
Writing the equation of motion in terms of its three components one has

$$ m\ddot{x}_1 = -\frac{Z}{l}\,x_1 + 2m\omega\,\dot{x}_2 \sin\varphi , $$

$$ m\ddot{x}_2 = -\frac{Z}{l}\,x_2 - 2m\omega\,(\dot{x}_1 \sin\varphi + \dot{x}_3 \cos\varphi) , $$

$$ m\ddot{x}_3 = \frac{Z}{l}\,(l - x_3) - mg + 2m\omega\,\dot{x}_2 \cos\varphi . $$

These coupled differential equations are solved most easily in the case of small
oscillations. In this approximation set $x_3 \simeq 0$, $\ddot{x}_3 \simeq 0$ in the third of these and
obtain the modulus of the thread stress from this equation. It is found to be

$$ Z = mg - 2m\omega\,\dot{x}_2 \cos\varphi . $$

Next, insert this approximate expression into the first two equations. For consistency
with the approximation of small oscillations terms of the type $x_i\dot{x}_k$ must be
neglected. In this approximation and introducing the abbreviations

$$ \omega_0^2 = \frac{g}{l} , \qquad \alpha = \omega \sin\varphi , $$

the first two equations become

$$ \ddot{x}_1 = -\omega_0^2 x_1 + 2\alpha\,\dot{x}_2 , $$

$$ \ddot{x}_2 = -\omega_0^2 x_2 - 2\alpha\,\dot{x}_1 . $$

Solutions of these equations can be constructed by writing them as one complex
equation in the variable $z(t) = x_1(t) + \mathrm{i}\,x_2(t)$,

$$ \ddot{z}(t) = -\omega_0^2\,z(t) - 2\mathrm{i}\alpha\,\dot{z}(t) . $$

The ansatz $z(t) = C\,\mathrm{e}^{\mathrm{i}\gamma t}$ yields two solutions for the circular frequency $\gamma$, viz.

$$ \gamma_1 = -\alpha + \sqrt{\alpha^2 + \omega_0^2} , \qquad \gamma_2 = -\alpha - \sqrt{\alpha^2 + \omega_0^2} . $$

Below we will study the solutions for these general expressions. The historical
experiment performed in 1851 by Foucault in the Pantheon in Paris, however, had
parameters such that $\alpha$ was very small as compared to $\omega_0$, $\alpha^2 \ll \omega_0^2$. Indeed,
given the latitude of Paris, $\varphi = 48.5^\circ$, and the parameters of the pendulum chosen
by Foucault, $l = 67$ m, $m = 28$ kg, and, from these, the period $T = 16.4$ s, one
obtains

$$ \omega_0 = \frac{2\pi}{T} = 0.383\ \mathrm{s}^{-1} , $$

$$ \alpha = \frac{2\pi}{1\ \mathrm{day}} \sin\varphi = \frac{2\pi}{86400\ \mathrm{s}} \sin(48.5^\circ) = 5.45 \times 10^{-5}\ \mathrm{s}^{-1} . $$

Therefore $\gamma_{1/2} \simeq -\alpha \pm \omega_0$ and the solutions read

$$ z(t) \simeq (c_1 + \mathrm{i}c_2)\,\mathrm{e}^{-\mathrm{i}(\alpha - \omega_0)t} + (c_3 + \mathrm{i}c_4)\,\mathrm{e}^{-\mathrm{i}(\alpha + \omega_0)t} . $$



It remains to split this function into its real and imaginary parts and to adjust the
integration constants to a given initial condition. Suppose the pendulum, at time
zero, is elongated along the 1-direction by a distance $a$ and is launched without
initial velocity, i.e.

$$ x_1(0) = a , \quad \dot{x}_1(0) = 0 , \quad x_2(0) = 0 , \quad \dot{x}_2(0) = 0 . $$

The approximate solution is then found to be

$$ x_1(t) \simeq a\,[\cos(\alpha t)\cos(\omega_0 t) + (\alpha/\omega_0)\sin(\alpha t)\sin(\omega_0 t)] , $$

$$ x_2(t) \simeq a\,[-\sin(\alpha t)\cos(\omega_0 t) + (\alpha/\omega_0)\cos(\alpha t)\sin(\omega_0 t)] . $$

As a result the pendulum still swings approximately in a plane. That plane of
oscillation rotates very slowly about the local vertical, in a clockwise direction in the
northern hemisphere, in a counter-clockwise direction in the southern hemisphere.
The mark that the tip of the pendulum would leave on the ground is bent slightly
to the right in the northern hemisphere, to the left in the southern hemisphere.
A complete turn of the plane of oscillation takes the time $24/\sin\varphi$ hours.
Right at the north pole or at the south pole this time is exactly 24 hours. For
the latitude of Paris, it is approximately 32 hours, while at the equator there is no
rotation at all.
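
The numbers quoted for Foucault's experiment can be reproduced in a few lines. The following is a minimal sketch that uses only the values given in the text (latitude 48.5 degrees, period 16.4 s, one day taken as 86400 s); the variable and function names are chosen here for illustration.

```python
import math

# Parameters quoted in the text for Foucault's 1851 experiment.
latitude_deg = 48.5   # latitude of Paris as used in the text
period_s = 16.4       # oscillation period T of the pendulum
day_s = 86400.0       # one day in seconds, as in the text

# Unperturbed circular frequency omega_0 = 2 pi / T.
omega0 = 2 * math.pi / period_s
# alpha = omega * sin(phi), with omega the angular velocity of the earth.
alpha = 2 * math.pi / day_s * math.sin(math.radians(latitude_deg))


def rotation_period_hours(lat_deg):
    """Time for one full turn of the plane of oscillation, 24 / sin(phi) hours."""
    return 24.0 / math.sin(math.radians(lat_deg))


paris_hours = rotation_period_hours(latitude_deg)
pole_hours = rotation_period_hours(90.0)

print(f"omega0 = {omega0:.3f} 1/s")
print(f"alpha  = {alpha:.3e} 1/s")
print(f"rotation period in Paris: {paris_hours:.1f} h")
```

The ratio $\alpha^2/\omega_0^2$ comes out of order $10^{-8}$, which is why the approximation $\gamma_{1/2} \simeq -\alpha \pm \omega_0$ is so good for the real pendulum.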

In order to better illustrate the motion of a Foucault pendulum for small
amplitudes let us also consider the case where $\alpha$ is not small as compared to the
unperturbed frequency $\omega_0$. For the same initial condition as above, $x_1(0) = a$,
$\dot{x}_1(0) = 0$, $x_2(0) = 0 = \dot{x}_2(0)$, the solution now reads

$$ x_1(t) = a\,[\cos(\alpha t)\cos(\bar{\omega} t) + (\alpha/\bar{\omega})\sin(\alpha t)\sin(\bar{\omega} t)] , $$

$$ x_2(t) = a\,[-\sin(\alpha t)\cos(\bar{\omega} t) + (\alpha/\bar{\omega})\cos(\alpha t)\sin(\bar{\omega} t)] , $$

where $\bar{\omega} = \sqrt{\omega_0^2 + \alpha^2}$.

It is useful to calculate also the components of the velocity. One finds

$$ \dot{x}_1 = -a\,\frac{\omega_0^2}{\bar{\omega}}\cos(\alpha t)\sin(\bar{\omega} t) , $$

$$ \dot{x}_2 = a\,\frac{\omega_0^2}{\bar{\omega}}\sin(\alpha t)\sin(\bar{\omega} t) . $$

The two components of the velocity vanish simultaneously at the times

$$ t_n = \frac{n\pi}{\bar{\omega}} = n\,\frac{\bar{T}}{2} , \qquad n = 0, 1, 2, \ldots , $$

with $\bar{T} = 2\pi/\bar{\omega}$. This means that at these points of return both components go
through zero and change signs; the projection of the pendulum motion on the horizontal
plane shows spikes. Figure 1.27 gives a qualitative top view of the motion.

For a quantitative analysis we choose the circular frequency $\alpha$ comparable to
$\bar{\omega}$. In the two examples given next these frequencies are chosen to have a rational
ratio, $\alpha/\bar{\omega} = 1/4$. (Clearly, this choice is not realistic for the case of the earth and the
original Foucault pendulum.) For a rational ratio $\alpha/\bar{\omega} = n/m$ the curve swept out
by the tip of the pendulum on the horizontal plane closes. In all other cases it will
not close. Figure 1.28 shows the solution given above for the initial condition

$$ x_1(0) = 1 , \quad \dot{x}_1(0) = 0 , \quad x_2(0) = 0 , \quad \dot{x}_2(0) = 0 . $$

It closes after four oscillations and exhibits the spikes discussed above.

Fig. 1.27. A Foucault pendulum, seen from above, starts
at the distance $a$ in the South, without initial velocity.
The mark it makes on the horizontal plane is bent to
the right and exhibits spikes at the turning points

Fig. 1.28. Mark left on the horizontal plane by a pendulum that starts from $x_1(0) = 1$ without initial
velocity. The ratio of circular frequencies is chosen rational, $\alpha/\bar{\omega} = 1/4$
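
The closed-form solution for zero initial velocity can be checked numerically. The sketch below evaluates the expressions of the text, verifies by a central difference that they satisfy the two coupled equations of motion, and confirms that for $\alpha/\bar{\omega} = 1/4$ the track closes after four oscillations. The function names, the step size `h`, and the sample times are choices made here for illustration.

```python
import math


def foucault(t, a, alpha, omega0):
    """Solution for x1(0) = a and zero initial velocity.

    Returns (x1, x2, v1, v2) from the closed-form expressions of the text,
    with omega_bar = sqrt(omega0**2 + alpha**2).
    """
    wb = math.sqrt(omega0**2 + alpha**2)
    ca, sa = math.cos(alpha * t), math.sin(alpha * t)
    cw, sw = math.cos(wb * t), math.sin(wb * t)
    x1 = a * (ca * cw + (alpha / wb) * sa * sw)
    x2 = a * (-sa * cw + (alpha / wb) * ca * sw)
    v1 = -a * omega0**2 / wb * ca * sw
    v2 = a * omega0**2 / wb * sa * sw
    return x1, x2, v1, v2


def residuals(t, a, alpha, omega0, h=1e-5):
    """Residuals of x1'' = -w0^2 x1 + 2 alpha x2' and x2'' = -w0^2 x2 - 2 alpha x1'.

    The accelerations are central differences of the analytic velocities.
    """
    x1, x2, v1, v2 = foucault(t, a, alpha, omega0)
    _, _, v1p, v2p = foucault(t + h, a, alpha, omega0)
    _, _, v1m, v2m = foucault(t - h, a, alpha, omega0)
    acc1 = (v1p - v1m) / (2 * h)
    acc2 = (v2p - v2m) / (2 * h)
    return (acc1 + omega0**2 * x1 - 2 * alpha * v2,
            acc2 + omega0**2 * x2 + 2 * alpha * v1)


# Illustrative (not realistic) choice with alpha / omega_bar = 1/4.
omega_bar = 1.0
alpha = 0.25
omega0 = math.sqrt(omega_bar**2 - alpha**2)
a = 1.0
T_bar = 2 * math.pi / omega_bar

# Largest residual of the equations of motion over a set of sample times.
worst = max(abs(r) for k in range(1, 50)
            for r in residuals(0.37 * k, a, alpha, omega0))

start = foucault(0.0, a, alpha, omega0)
half_way = foucault(2 * T_bar, a, alpha, omega0)    # opposite point of the track
after_four = foucault(4 * T_bar, a, alpha, omega0)  # the track has closed
```

After two oscillations the pendulum sits at the point opposite to its start, and only after four does it return to the initial state, in agreement with the statement about Fig. 1.28.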



Another solution is the following:

$$ x_1(t) = a \sin(\alpha t)\sin(\bar{\omega} t) , $$

$$ x_2(t) = a \cos(\alpha t)\sin(\bar{\omega} t) . $$

Fig. 1.29. Mark swept out by the pendulum when it starts from the equilibrium position and is
kicked with initial velocity $a\bar{\omega}$ in the 2-direction. The ratio of circular frequencies is chosen rational
and has the same value as in Fig. 1.28

From this one finds the components of the velocity to be

$$ \dot{x}_1 = a\,[\alpha\cos(\alpha t)\sin(\bar{\omega} t) + \bar{\omega}\sin(\alpha t)\cos(\bar{\omega} t)] , $$

$$ \dot{x}_2 = a\,[-\alpha\sin(\alpha t)\sin(\bar{\omega} t) + \bar{\omega}\cos(\alpha t)\cos(\bar{\omega} t)] , $$

which corresponds to the initial condition

$$ x_1(0) = 0 , \quad \dot{x}_1(0) = 0 , \quad x_2(0) = 0 , \quad \dot{x}_2(0) = a\bar{\omega} . $$

This solution is the one where the pendulum starts at the equilibrium position and
is kicked in the 2-direction with initial velocity $a\bar{\omega}$. At the points of return
$x_1^2 + x_2^2 = a^2\sin^2(\bar{\omega} t)$ is maximal. Thus, they occur at times $t_n = (2n + 1)\pi/(2\bar{\omega})$,
and one has

$$ \dot{x}_1(t_n) = a\alpha\,(-1)^n \cos\big((2n + 1)(\alpha/\bar{\omega})(\pi/2)\big) , $$

$$ \dot{x}_2(t_n) = -a\alpha\,(-1)^n \sin\big((2n + 1)(\alpha/\bar{\omega})(\pi/2)\big) . $$

This means that the track on the horizontal plane exhibits no more spikes but is
"rounded off" at these points. This solution is illustrated by Fig. 1.29 for the case
of the same rational ratio of $\alpha$ and $\bar{\omega}$ as in the previous example.



1.27 Scattering of Two Particles that Interact 
via a Central Force: Kinematics 

In our discussion of central forces acting between two particles we have touched
only briefly on the infinite orbits, i.e. those which come from and escape to infinity.
In this section and in the two that follow we wish to analyze these scattering
orbits in more detail and to study the kinematics and the dynamics of the scattering
process. The description of scattering processes is of central importance for
physics at the smallest dimensions. In the laboratory one can prepare and identify
free, incoming or outgoing states by means of macroscopic particle sources and
detectors. That is, one observes the scattering states long before and long after the
scattering process proper, at large distances from the interaction region, but one
cannot observe what is happening in the vicinity of the interaction region. The
outcome of such scattering processes may therefore be the only, somewhat indirect,
source of information on the dynamics at small distances. To quote an example,
the scattering of $\alpha$-particles on atomic nuclei, which Rutherford calculated on the
basis of classical mechanics (see Sect. 1.28 (ii) and Sect. 1.29), was instrumental
in discovering nuclei and in measuring their sizes.

We consider two particles of masses $m_1$ and $m_2$ whose interaction is given by
a spherically symmetric potential $U(r)$ (repulsive or attractive). The potential is
assumed to tend to zero at infinity at least like $1/r$. In the laboratory the experiment
is usually performed in such a way that particle 2 is taken to be at rest (this is the
target) while particle 1 (the projectile) comes from infinity and scatters off particle
2 so that both escape to infinity. This is sketched in Fig. 1.30a. This type of motion
looks asymmetric in the two particles because in addition to the relative motion it
contains the motion of the center of mass, which moves along with the projectile
(to the right in the figure). If one introduces a second frame of reference whose
origin is the center of mass, the motion is restricted to the relative motion alone
(which is the relevant one dynamically) and one obtains the symmetric picture
shown in Fig. 1.30b. Both the laboratory system and the center-of-mass system
are inertial systems. We can characterize the two particles by their momenta long
before and long after the collision, in either system, as follows:

in the laboratory system:
$\mathbf{p}_i$ before, $\mathbf{p}'_i$ after the collision, $i = 1, 2$;

in the center-of-mass system:
$\mathbf{q}^*$ and $-\mathbf{q}^*$ before, $\mathbf{q}'^*$ and $-\mathbf{q}'^*$ after the collision.

Fig. 1.30. (a) Projectile 1 comes from infinity and scatters off target 2, which is initially at rest.
(b) The same scattering process seen from the center of mass of particles 1 and 2. The asymmetry
between projectile and target disappears



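If we deal with an elastic collision, i.e. if the internal state of the particles does
not change in the collision, then $\mathbf{p}_2 = 0$ and energy conservation together imply
that

$$ \frac{\mathbf{p}_1^2}{2m_1} = \frac{\mathbf{p}_1'^2}{2m_1} + \frac{\mathbf{p}_2'^2}{2m_2} . \tag{1.77} $$

In addition, momentum conservation gives

$$ \mathbf{p}_1 = \mathbf{p}'_1 + \mathbf{p}'_2 . \tag{1.78} $$

Decomposing in terms of center-of-mass and relative momenta, and making use
of the equations obtained in Sect. 1.7.3, one obtains for the initial state

$$ \mathbf{p}_1 = \frac{m_1}{M}\mathbf{P} + \mathbf{q}^* , \qquad (M = m_1 + m_2) $$

$$ \mathbf{p}_2 = \frac{m_2}{M}\mathbf{P} - \mathbf{q}^* = 0 ; $$

that is,

$$ \mathbf{P} = \frac{M}{m_2}\,\mathbf{q}^* \quad \text{and} \quad \mathbf{p}_1 = \mathbf{P} . \tag{1.79a} $$

Likewise, after the collision we have

$$ \mathbf{p}'_1 = \frac{m_1}{M}\mathbf{P} + \mathbf{q}'^* = \frac{m_1}{m_2}\,\mathbf{q}^* + \mathbf{q}'^* , \qquad \mathbf{p}'_2 = \frac{m_2}{M}\mathbf{P} - \mathbf{q}'^* = \mathbf{q}^* - \mathbf{q}'^* . \tag{1.79b} $$

As the kinetic energy of the relative motion is conserved, $\mathbf{q}^*$ and $\mathbf{q}'^*$ have the same
magnitude,

$$ |\mathbf{q}^*| = |\mathbf{q}'^*| \overset{\mathrm{def}}{=} q^* . $$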

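Let $\theta$ and $\theta^*$ denote the scattering angle in the laboratory and center-of-mass
frames, respectively. In order to convert one into the other it is convenient to
consider the quantities $\mathbf{p}_1 \cdot \mathbf{p}'_1$ and $\mathbf{q}^* \cdot \mathbf{q}'^*$, which are invariant under rotations. With
$\mathbf{p}_1 = \mathbf{q}^* M/m_2$ and $\mathbf{p}'_1 = \mathbf{q}^* m_1/m_2 + \mathbf{q}'^*$ one has

$$ \mathbf{p}_1 \cdot \mathbf{p}'_1 = \frac{M}{m_2}\left( \frac{m_1}{m_2}\,q^{*2} + \mathbf{q}^* \cdot \mathbf{q}'^* \right) = \frac{M}{m_2}\,q^{*2}\left( \frac{m_1}{m_2} + \cos\theta^* \right) . $$

On the other hand,

$$ \mathbf{p}_1 \cdot \mathbf{p}'_1 = |\mathbf{p}_1|\,|\mathbf{p}'_1| \cos\theta = \frac{M}{m_2}\,q^* \left| \frac{m_1}{m_2}\,\mathbf{q}^* + \mathbf{q}'^* \right| \cos\theta = \frac{M}{m_2}\,q^{*2} \sqrt{1 + 2\frac{m_1}{m_2}\cos\theta^* + \left(\frac{m_1}{m_2}\right)^2}\; \cos\theta . $$

From this follows

$$ \cos\theta = \left( \frac{m_1}{m_2} + \cos\theta^* \right) \Big/ \sqrt{1 + 2\frac{m_1}{m_2}\cos\theta^* + \left(\frac{m_1}{m_2}\right)^2} , $$

or

$$ \sin\theta = \sin\theta^* \Big/ \sqrt{1 + 2\frac{m_1}{m_2}\cos\theta^* + \left(\frac{m_1}{m_2}\right)^2} , $$

or, finally,

$$ \tan\theta = \frac{\sin\theta^*}{(m_1/m_2) + \cos\theta^*} . \tag{1.80} $$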



In Fig. 1.30a the target particle escapes in the direction characterized by the
angle $\phi$ in the laboratory system. By observing that the triangle $(\mathbf{p}'_2, \mathbf{q}^*, \mathbf{q}'^*)$ has
two equal sides and that $\mathbf{q}^*$ has the same direction as $\mathbf{p}_1$ one can easily show that
$\phi$ is related to the scattering angle in the center-of-mass system by

$$ \phi = \frac{\pi - \theta^*}{2} . \tag{1.81} $$
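
Relations (1.79)-(1.81) are easy to exercise numerically. The following sketch converts a center-of-mass scattering angle into the two laboratory angles and builds the final laboratory momenta from (1.79a) and (1.79b), so that energy and momentum conservation, (1.77) and (1.78), can be checked directly. The function names and the convention that the beam runs along the 1-axis are assumptions made for this illustration.

```python
import math


def lab_angles(theta_cm, m1, m2):
    """Laboratory angles for elastic scattering off a target initially at rest.

    theta_cm is the center-of-mass scattering angle in radians.
    Returns (theta, phi): the projectile angle from (1.80) and the
    recoil angle of the target from (1.81).
    """
    # atan2 selects the correct branch when the denominator of (1.80) is negative.
    theta = math.atan2(math.sin(theta_cm), m1 / m2 + math.cos(theta_cm))
    phi = (math.pi - theta_cm) / 2
    return theta, phi


def final_momenta(theta_cm, m1, m2, q):
    """Laboratory momenta before and after the collision, from (1.79a) and (1.79b).

    q is the modulus q* of the relative momentum; the beam is along the 1-axis.
    Returns (p1_in, p1_out, p2_out) as 2-tuples.
    """
    q_in = (q, 0.0)
    q_out = (q * math.cos(theta_cm), q * math.sin(theta_cm))
    p1_in = ((m1 + m2) / m2 * q, 0.0)
    p1_out = (m1 / m2 * q_in[0] + q_out[0], m1 / m2 * q_in[1] + q_out[1])
    p2_out = (q_in[0] - q_out[0], q_in[1] - q_out[1])
    return p1_in, p1_out, p2_out


# Example: equal masses, theta* = 60 degrees.
theta, phi = lab_angles(math.radians(60.0), 1.0, 1.0)
print(math.degrees(theta), math.degrees(phi))
```

Using `atan2` rather than a plain arctangent matters for $m_1 < m_2$, where $(m_1/m_2) + \cos\theta^*$ can change sign and the laboratory angle exceeds 90 degrees.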



Several special cases can be read off the formulae (1.79) and (1.80).

(i) If the mass $m_1$ of the projectile is much smaller than the mass $m_2$ of the
target, $m_1 \ll m_2$, then $\theta^* \simeq \theta$. The difference between the laboratory and
center-of-mass frames disappears in the limit of a target that is very heavy compared to
the projectile.

(ii) If the masses are equal, $m_1 = m_2$, (1.80) and (1.81) give the relations

$$ \theta = \theta^*/2 , \qquad \theta + \phi = \pi/2 . $$

With respect to the laboratory system the outgoing particles leave in directions
perpendicular to each other. In particular, in the case of a central collision, $\theta^* = \pi$,
and, because of $\mathbf{q}'^* = -\mathbf{q}^*$,

$$ \mathbf{p}'_1 = 0 , \qquad \mathbf{p}'_2 = \mathbf{p}_1 . $$

The projectile comes to a complete rest, while the target particle takes over the
momentum of the incoming projectile.



1.28 Two-Particle Scattering with a Central Force: Dynamics

Consider a scattering problem in the laboratory system sketched in Fig. 1.31. The
projectile (1) comes in from infinity with initial momentum $\mathbf{p}_1$, while the target (2)
is initially at rest. The initial configuration is characterized by the vector $\mathbf{p}_1$ and by
a two-dimensional vector $\mathbf{b}$, perpendicular to $\mathbf{p}_1$, which indicates the azimuthal
angle and the distance from the $z$-axis (as drawn in the figure) of the incident
particle. This impact vector is directly related to the angular momentum:

$$ \mathbf{l} = \mathbf{r} \times \mathbf{q}^* = \frac{m_2}{M}\,\mathbf{r} \times \mathbf{p}_1 = \frac{m_2}{M}\,\mathbf{b} \times \mathbf{p}_1 = \mathbf{b} \times \mathbf{q}^* . \tag{1.82} $$

Its modulus $b = |\mathbf{b}|$ is called the impact parameter and is given by

$$ b = \frac{M}{m_2 |\mathbf{p}_1|}\,|\mathbf{l}| = \frac{1}{q^*}\,|\mathbf{l}| . \tag{1.83} $$

Fig. 1.31. Kinematics of a scattering process
with two particles, seen from the laboratory
system. The particle with mass $m_2$ is at rest
before the scattering

If the interaction is spherically symmetric (as assumed here), or if it is axially
symmetric about the $z$-axis, the direction of $\mathbf{b}$ in the plane perpendicular to the
$z$-axis does not matter. Only its modulus, the impact parameter (1.83), is dynamically
relevant.

For a given potential $U(r)$ we must determine the angle $\theta$ into which particle
1 will be scattered, once its momentum $\mathbf{p}_1$ and its relative angular momentum
are given. The general analysis presented in Sects. 1.7.1 and 1.24 tells us that we
must solve the equivalent problem of the scattering of a fictitious particle of mass
$\mu = m_1 m_2/M$, subject to the potential $U(r)$. This is sketched in Fig. 1.32. We
have

$$ E = \frac{q^{*2}}{2\mu} , \qquad \mathbf{l} = \mathbf{b} \times \mathbf{q}^* . \tag{1.84} $$



Let $P$ be the pericenter, i.e. the point of closest approach. Figure 1.32 shows the
scattering process for a repulsive potential and for different values of the impact
parameter. In Sect. 1.24 we showed that every orbit is symmetric with respect to
the straight line joining the force center $O$ and the pericenter $P$. Therefore, the
two asymptotes to the orbit must also be symmetric with respect to $OP$.⁷ Thus,
if $\varphi_0$ is the angle between $OP$ and the asymptotes, we have

$$ \theta^* = |\pi - 2\varphi_0| . $$

The angle $\varphi_0$ is obtained from (1.68), making use of the relations (1.84)

$$ \varphi_0 = \int_{r_P}^{\infty} \frac{l\,dr}{r^2 \sqrt{2\mu(E - U(r)) - l^2/r^2}} = \int_{r_P}^{\infty} \frac{b\,dr}{r^2 \sqrt{1 - b^2/r^2 - 2\mu U(r)/q^{*2}}} . \tag{1.85} $$

Fig. 1.32. Scattering orbits in the repulsive potential
$U(r) = A/r$ (with $A > 0$). The impact parameter is
measured in units of the characteristic length $\lambda \overset{\mathrm{def}}{=} A/E$,
with $E$ the energy of the incoming particle. Cf. Practical Example 1.5



For a given $U(r)$, $\varphi_0$, and hence the scattering angle $\theta^*$, are calculated from this
equation as functions of $q^*$ (i.e. of the energy, via (1.84)) and of $b$. However, some
care is needed depending on whether or not the connection between $b$ and $\theta^*$ is
unique. There are potentials such as the attractive $1/r^2$ potential where a given
scattering angle is reached from two or more different values of the impact
parameter. This happens when the orbit revolves about the force center more than
once (see Example (iii) below).

A measure for the scattering in the potential $U(r)$ is provided by the differential
cross section $d\sigma$. It is defined as follows. Let $n_0$ be the number of particles incident
on the unit area per unit time; $dn$ is the number of particles per unit time that are
scattered with scattering angles that lie between $\theta^*$ and $\theta^* + d\theta^*$. The differential
cross section is then defined by

$$ d\sigma \overset{\mathrm{def}}{=} \frac{1}{n_0}\,dn . \tag{1.86} $$

Its physical dimension is $[d\sigma] =$ area.

⁷ The orbit possesses asymptotes only if the potential tends to zero sufficiently fast at infinity. As
we shall learn in the next section, the relatively weak decrease $1/r$ is already somewhat strange.

If the relation between $b(\theta^*)$ and $\theta^*$ is unique, then $dn$ is proportional to $n_0$
and to the area of the annulus with radii $b$ and $b + db$,

$$ dn = n_0\,2\pi b(\theta^*)\,db , $$

and therefore

$$ d\sigma = 2\pi b(\theta^*)\,db = 2\pi b(\theta^*) \left| \frac{db(\theta^*)}{d\theta^*} \right| d\theta^* . $$

If to a fixed $\theta^*$ there correspond several values of $b(\theta^*)$, the contributions of all
branches of this function must be added.

It is convenient to refer $d\sigma$ to the infinitesimal surface element on the unit
sphere $d\Omega^* = \sin\theta^*\,d\theta^*\,d\phi^*$ and to integrate over the azimuth $\phi^*$. With
$d\omega \equiv 2\pi \sin\theta^*\,d\theta^*$ we then have

$$ d\sigma = \frac{b(\theta^*)}{\sin\theta^*} \left| \frac{db(\theta^*)}{d\theta^*} \right| d\omega . \tag{1.87} $$



We study three instructive examples. 

Example (i) Scattering off an ideally reflecting sphere. With the notations of
Fig. 1.33

$$ b = R \sin\frac{\Delta\alpha}{2} = R \cos\frac{\theta^*}{2} . $$

Here we have used the relationship $\Delta\alpha = \pi - \theta^*$, which follows from the equality
of the angle of incidence and the angle of reflection. Thus

$$ \frac{db}{d\theta^*} = -\frac{R}{2} \sin\frac{\theta^*}{2} \quad \text{and} \quad \frac{d\sigma}{d\omega} = \frac{R^2}{2}\,\frac{\cos(\theta^*/2)\,\sin(\theta^*/2)}{\sin\theta^*} = \frac{R^2}{4} . $$

Integrating over $d\omega$ we obtain the total elastic cross section

$$ \sigma_{\mathrm{tot}} = \pi R^2 , $$

a result that has a simple geometric interpretation: the particle sees the projection
of the sphere onto a plane perpendicular to its momentum.

Fig. 1.33. Scattering by an ideally reflecting sphere of
radius $R$
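
Formula (1.87) lends itself to a direct numerical check. The sketch below evaluates it for a single-valued $b(\theta^*)$ with a central-difference derivative and integrates over $d\omega$ with the midpoint rule; applied to the reflecting sphere it should return $R^2/4$ at every angle and $\pi R^2$ in total. The helper names, the step size `h`, the number of integration points and the radius `R = 1.5` are illustrative choices made here.

```python
import math


def dsigma_domega(b_of_theta, theta, h=1e-6):
    """Differential cross section (1.87) for a single-valued b(theta*).

    The derivative db/dtheta* is approximated by a central difference.
    """
    db = (b_of_theta(theta + h) - b_of_theta(theta - h)) / (2 * h)
    return b_of_theta(theta) / math.sin(theta) * abs(db)


def total_cross_section(b_of_theta, n=2000):
    """Integrate dsigma/domega over domega = 2 pi sin(theta) dtheta (midpoint rule)."""
    step = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * step
        total += dsigma_domega(b_of_theta, theta) * 2 * math.pi * math.sin(theta) * step
    return total


R = 1.5  # radius of the sphere, arbitrary units


def b_sphere(theta):
    """Impact parameter of the ideally reflecting sphere, b = R cos(theta*/2)."""
    return R * math.cos(theta / 2)


sigma_tot = total_cross_section(b_sphere)
print(sigma_tot, math.pi * R**2)
```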

Example (ii) Scattering of particles off nuclei (Rutherford scattering). The
potential is $U(r) = \kappa/r$ with $\kappa = q_1 q_2$, where $q_1$ is the charge of the $\alpha$-particle (this is a
helium nucleus, which has charge $q_1 = 2e$), while $q_2$ is the charge of the nucleus
that one is studying. Equation (1.85) can be integrated by elementary methods and
one finds (making use of a good table of integrals) that

$$ \varphi_0 = \arctan\left( \frac{b\,q^{*2}}{\mu\kappa} \right) \tag{1.88} $$

or

$$ \tan\varphi_0 = \frac{b\,q^{*2}}{\mu\kappa} , \tag{1.88'} $$

from which follows

$$ b^2 = \frac{\kappa^2\mu^2}{q^{*4}} \tan^2\varphi_0 = \frac{\kappa^2\mu^2}{q^{*4}} \cot^2\frac{\theta^*}{2} $$

and, finally, Rutherford's formula

$$ \frac{d\sigma}{d\omega} = \left( \frac{\kappa}{4E} \right)^2 \frac{1}{\sin^4(\theta^*/2)} . \tag{1.89} $$

This formula, which is also valid in the context of quantum mechanics, was the
key to the discovery of atomic nuclei. It gave the first hint that Coulomb's law is
valid at least down to distances of the order of magnitude $10^{-12}$ cm.

In this example the differential cross section diverges in the forward direction,
$\theta^* = 0$, and the total elastic cross section $\sigma_{\mathrm{tot}} = \int d\omega\,(d\sigma/d\omega)$ is infinite. The
reason for this is the slow decrease of the potential at infinity. $U(r) = \kappa/r$ can be
felt even at infinity, it is "long ranged". This difficulty arises with all potentials
whose range is infinite.
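
Both steps of the Rutherford example can be verified numerically. The sketch below first evaluates the integral (1.85) for $U(r) = \kappa/r$ and compares it with the closed form (1.88), then recovers (1.89) from (1.87) using $b(\theta^*) = (\kappa/2E)\cot(\theta^*/2)$, which follows from (1.88') with $E = q^{*2}/2\mu$. The substitutions $u = 1/r$ and $u = u_+(1 - s^2)$, which remove the integrable singularity at the pericenter, are a device chosen here and are not taken from the text; all function names are likewise illustrative.

```python
import math


def phi0_numeric(b, q, mu, kappa, n=2000):
    """Evaluate the integral (1.85) for U(r) = kappa / r with kappa > 0.

    With u = 1/r the integrand is b / sqrt(1 - c*u - b^2 u^2), c = 2 mu kappa / q^2.
    Writing the radicand as b^2 (u_plus - u)(u - u_minus) and substituting
    u = u_plus * (1 - s^2) gives a smooth integrand on s in [0, 1].
    """
    c = 2 * mu * kappa / q**2
    root = math.sqrt(c**2 + 4 * b**2)
    u_plus = (-c + root) / (2 * b**2)    # equals 1 / r_P
    u_minus = (-c - root) / (2 * b**2)   # negative root of the radicand
    step = 1.0 / n
    total = 0.0
    for k in range(n):                   # midpoint rule in s
        s = (k + 0.5) * step
        total += 2 * math.sqrt(u_plus) / math.sqrt(u_plus * (1 - s * s) - u_minus) * step
    return total


def phi0_closed(b, q, mu, kappa):
    """Closed form (1.88)."""
    return math.atan(b * q**2 / (mu * kappa))


def rutherford(theta, E, kappa):
    """Right-hand side of Rutherford's formula (1.89)."""
    return (kappa / (4 * E))**2 / math.sin(theta / 2)**4


def rutherford_from_b(theta, E, kappa, h=1e-6):
    """dsigma/domega from (1.87) with b(theta*) = (kappa / 2E) cot(theta*/2)."""
    def b(th):
        return kappa / (2 * E) / math.tan(th / 2)
    db = (b(theta + h) - b(theta - h)) / (2 * h)
    return b(theta) / math.sin(theta) * abs(db)


print(phi0_numeric(0.7, 1.3, 1.0, 1.0), phi0_closed(0.7, 1.3, 1.0, 1.0))
```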

Example (iii) Two-body scattering for an attractive inverse square potential. The
potential is $U(r) = -\alpha/r^2$, where $\alpha$ is a positive constant. For positive energy
$E > 0$ all orbits are scattering orbits. If $l^2 > 2\mu\alpha$, we have

$$ \varphi - \varphi_0 = r_P^{(0)} \int_{r_0}^{r} \frac{dr'}{r' \sqrt{r'^2 - r_P^2}} , $$

with $\mu$ the reduced mass, $r_P = \sqrt{(l^2 - 2\mu\alpha)/2\mu E}$ the distance at perihelion, and
$r_P^{(0)} = l/\sqrt{2\mu E}$. If the projectile comes in along the $x$-axis the solution is

$$ \varphi(r) = \frac{l}{\sqrt{l^2 - 2\mu\alpha}} \arcsin(r_P/r) . $$

We verify that for $\alpha = 0$ there is no scattering. In this case

$$ \varphi^{(0)}(r) = \arcsin(r_P^{(0)}/r) , $$

which means that the projectile moves along a straight line parallel to the $x$-axis,
at a distance $r_P^{(0)}$ from the scattering center. For $\alpha \neq 0$ the azimuth at $r_P$ is

$$ \varphi(r = r_P) = \frac{l}{\sqrt{l^2 - 2\mu\alpha}}\,\frac{\pi}{2} . $$

Therefore, after the scattering the particle moves in the direction

$$ \frac{l}{\sqrt{l^2 - 2\mu\alpha}}\,\pi . $$

It turns around the force center $n$ times if the condition

$$ \frac{l}{\sqrt{l^2 - 2\mu\alpha}} \left( \arcsin\frac{r_P}{r_P} - \arcsin\frac{r_P}{\infty} \right) = \frac{r_P^{(0)}}{r_P}\,\frac{\pi}{2} > n\pi $$

is fulfilled. Thus, $n = r_P^{(0)}/2r_P$, independently of the energy.

For $l^2 < 2\mu\alpha$ the integral above is (for the same initial condition)

$$ \varphi(r) = \frac{r_P^{(0)}}{b} \ln\frac{b + \sqrt{b^2 + r^2}}{r} , $$

where we have set $b = \sqrt{(2\mu\alpha - l^2)/2\mu E}$. The particle revolves about the force
center, along a shrinking spiral. As the radius goes to zero, the angular velocity $\dot{\varphi}$
increases beyond any limit such that the product $\mu r^2 \dot{\varphi} = l$ stays constant (Kepler's
second law).



1.29 Example: Coulomb Scattering of Two Particles 
with Equal Mass and Charge 

It is instructive to study Rutherford scattering in center-of-mass and relative
coordinates and thereby derive the individual orbits of the projectile and target particles.
We take the masses to be equal, $m_1 = m_2 = m$, and the charges to be equal,
$q_1 = q_2 = Q$, for the sake of simplicity. The origin $O$ of the laboratory system

Fig. 1.34. Scattering of two equally charged particles of equal masses, under
the action of the Coulomb force. The hyperbola branches are the orbits with
respect to the center of mass. The arrows indicate the velocities at pericenter and
long after the scattering with respect to the laboratory system

$$ \cos\varphi_0 = \frac{1}{\varepsilon} , \qquad \sin\varphi_0 = \frac{\sqrt{\varepsilon^2 - 1}}{\varepsilon} . $$

Thus, in the center-of-mass frame $r_1^* = r_2^* = r(\varphi)/2$, with $r(\varphi)$ from (1.92). Here
$\varphi$ is the orbit parameter, its relation to the azimuth angles of particles 1 and 2
being

$$ \varphi_1 = \pi - \varphi , \qquad \varphi_2 = 2\pi - \varphi . $$

This means, in particular, that the motion of the two particles on their hyperbolas
(Fig. 1.34) is synchronous in the parameter $\varphi$.

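It is not difficult to derive the velocities $\mathbf{v}_1(t)$ and $\mathbf{v}_2(t)$ of the two particles
in the laboratory system from (1.90-92). One needs the relation $d\varphi/dt = 2l/mr^2$.
From this and from

$$ \frac{dx_1}{dt} = \sqrt{\frac{E_r}{m}} + \frac{1}{2}\frac{d}{dt}(r\cos\varphi_1) = \sqrt{\frac{E_r}{m}} + \frac{1}{2}\frac{d}{d\varphi}(r\cos\varphi_1)\,\frac{d\varphi}{dt} , \quad \text{etc.} $$

one finds the result

$$ \frac{dx_1}{dt} = 2\sqrt{\frac{E_r}{m}} - \frac{l}{mp}\sin\varphi = \frac{l}{mp}\left( 2\sqrt{\varepsilon^2 - 1} - \sin\varphi \right) , \qquad \frac{dy_1}{dt} = \frac{l}{mp}\,(1 - \cos\varphi) . \tag{1.93} $$

For $\mathbf{v}_2(t)$ one obtains

$$ \frac{dx_2}{dt} = \frac{l}{mp}\sin\varphi , \qquad \frac{dy_2}{dt} = -\frac{l}{mp}\,(1 - \cos\varphi) . \tag{1.94} $$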
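
The velocity formulas (1.93) and (1.94) can be cross-checked against the conservation laws. In the sketch below velocities are measured in units of $l/(mp)$, an illustrative normalization chosen here; the code confirms that the total momentum does not depend on $\varphi$, that the kinetic energy long after the collision equals the initial one, and that the two outgoing velocities are perpendicular, as expected for equal masses. The value $\varepsilon = 2/\sqrt{3}$ is used only as a sample.

```python
import math


def velocities(phi, eps):
    """Laboratory velocities (1.93) and (1.94) in units of l / (m p).

    phi is the orbit parameter, running from 0 (long before the collision)
    to 2 * phi0 (long after it); eps is the eccentricity of the hyperbola.
    """
    root = math.sqrt(eps**2 - 1)
    v1 = (2 * root - math.sin(phi), 1 - math.cos(phi))
    v2 = (math.sin(phi), -(1 - math.cos(phi)))
    return v1, v2


eps = 2 / math.sqrt(3)      # sample eccentricity
phi0 = math.acos(1 / eps)   # cos(phi0) = 1 / eps, here 30 degrees

v1_in, v2_in = velocities(0.0, eps)
v1_out, v2_out = velocities(2 * phi0, eps)

# Total momentum per unit mass (equal masses): v1 + v2.
total_in = (v1_in[0] + v2_in[0], v1_in[1] + v2_in[1])
total_out = (v1_out[0] + v2_out[0], v1_out[1] + v2_out[1])

# Kinetic energy up to the common factor m / 2.
ke_in = sum(c * c for c in v1_in) + sum(c * c for c in v2_in)
ke_out = sum(c * c for c in v1_out) + sum(c * c for c in v2_out)

# Scalar product of the outgoing velocities; zero means perpendicular.
dot_out = v1_out[0] * v2_out[0] + v1_out[1] * v2_out[1]
```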



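Three special cases, two of which are marked with arrows in Fig. 1.34, are read
off these formulae.

(i) At the beginning of the motion, $\varphi = 0$:

$$ \mathbf{v}_1 = \left( 2\sqrt{\frac{E_r}{m}},\ 0 \right) , \qquad \mathbf{v}_2 = (0, 0) . $$

(ii) At the pericenter, $\varphi = \varphi_0$:

$$ \mathbf{v}_1 = \frac{l}{mp\varepsilon}\left( (2\varepsilon - 1)\sqrt{\varepsilon^2 - 1},\ \varepsilon - 1 \right) , \qquad \mathbf{v}_2 = \frac{l}{mp\varepsilon}\left( \sqrt{\varepsilon^2 - 1},\ -(\varepsilon - 1) \right) . $$

(iii) After the scattering, $\varphi = 2\varphi_0$:

$$ \mathbf{v}_1 = \frac{2l(\varepsilon^2 - 1)}{mp\varepsilon^2}\left( \sqrt{\varepsilon^2 - 1},\ 1 \right) , \qquad \mathbf{v}_2 = \frac{2l}{mp\varepsilon^2}\left( \sqrt{\varepsilon^2 - 1},\ -(\varepsilon^2 - 1) \right) . $$

Thus, the slope of $\mathbf{v}_1$ is $1/\sqrt{\varepsilon^2 - 1} = 1/\tan\varphi_0$, while the slope of $\mathbf{v}_2$ is
$-\sqrt{\varepsilon^2 - 1} = -\tan\varphi_0$.

Of course, it is also possible to give the functions $x_i(\varphi)$ and $y_i(\varphi)$ in closed
form, once $t(\varphi)$ is calculated from (1.65):

$$ t(\varphi) = \frac{mp^2}{2l} \int_{\varphi_0}^{\varphi} \frac{d\varphi'}{(1 - \varepsilon\cos(\varphi' - \varphi_0))^2} . \tag{1.95} $$

(The reader should do this.) Figure 1.35 shows the scattering orbits in the
center-of-mass system for the case $\varepsilon = 2/\sqrt{3}$, i.e. for $\varphi_0 = 30^\circ$, in the basis of the
dimensionless variables $2x_i/p$ and $2y_i/p$. The same picture shows the positions
of the two particles in the laboratory system as a function of the dimensionless
time variable

$$ \tau \overset{\mathrm{def}}{=} t\,\frac{2l}{mp^2} . $$

According to (1.95) this variable is chosen so that the pericenter is reached for
$\tau = 0$.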




Fig. 1.35. Coulomb scattering of two particles
($m_1 = m_2$, $q_1 = q_2$) with $\varphi_0 = 30^\circ$. The hyperbola
branches are the scattering orbits in the
center-of-mass system. The open points show
the positions of the two particles in the laboratory
system at the indicated times

The problem considered here has a peculiar property that one meets in asking
where the target (particle 2) was at time $t = -\infty$. The answer is not evident from
the figure and one must return to (1.92). With $dx_2/d\varphi = r^2 \sin\varphi/2p$ one finds
from this equation that

$$ x_2(\varphi_0) - x_2(0) = \frac{p}{2} \int_0^{\varphi_0} \frac{\sin\varphi\,d\varphi}{\left( 1 - \cos\varphi - \sqrt{\varepsilon^2 - 1}\,\sin\varphi \right)^2} . $$

This integral is logarithmically divergent. This means that in the laboratory system
particle 2 also came from $x_2 = -\infty$. This somewhat strange result gives a first
hint at the peculiar nature of the "long-range" potential $1/r$ that will be met again
in quantum mechanics and quantum field theory.



1.30 Mechanical Bodies of Finite Extension 

So far we have exclusively considered pointlike mechanical objects, i.e. particles 
that carry a finite mass but have no finite spatial extension. In its application 
to macroscopic mechanical bodies this is an idealization whose validity must be 
checked in every single case. The simple systems of Newton’s point mechanics that 
we studied in this chapter primarily serve the purpose of preparing the ground for a 
systematic construction of canonical mechanics. This, in turn, allows the develop- 
ment of more general principles for physical theories, after some more abstraction 
and generalization. One thereby leaves the field of the mechanics of macroscopic 
bodies proper but develops a set of general and powerful tools that are useful in 
describing continuous systems as well as classical field theories. 

This section contains a few remarks about the validity of our earlier results 
for those cases where mass points are replaced with mass distributions of finite 
extension. 

Consider a mechanical body of finite extension. Finite extension means that
the body can always be enclosed by a sphere of finite radius. Let the body be
characterized by a time-independent (rigid) mass density $\varrho(\mathbf{x})$, and let $m$ be its total
mass. Integrating over all space, one evidently has

$$ \int d^3x\,\varrho(\mathbf{x}) = m . \tag{1.96} $$

The dimension of $\varrho$ is mass/(length)$^3$.

For example, assume the mass density to be spherically symmetric with respect
to the center $O$. Taking this point as the origin this means that

$$ \varrho(\mathbf{x}) = \varrho(r) , \qquad r \overset{\mathrm{def}}{=} |\mathbf{x}| . $$

In spherical coordinates the volume element is

$$ d^3x = \sin\theta\,d\theta\,d\phi\,r^2\,dr . $$

Since $\varrho$ does not depend on $\theta$ and $\phi$, the integration over these variables can be
carried out, so that the condition (1.96) becomes

$$ 4\pi \int_0^\infty r^2\,dr\,\varrho(r) = m . \tag{1.97} $$

Equation (1.96) suggests the introduction of a differential mass element

$$ dm \overset{\mathrm{def}}{=} \varrho(\mathbf{x})\,d^3x . \tag{1.98} $$

In a situation where the resulting differential force $d\mathbf{K}$ is applied to this mass
element, it is plausible to generalize the relation (1.8b) between force and acceleration
as follows:

$$ \ddot{\mathbf{x}}\,dm = d\mathbf{K} . \tag{1.99} $$

(This postulate is due to L. Euler and was published in 1750.) We are now in a 
position to treat the interaction of two extended celestial bodies. We solve this 
problem in several steps. 

(i) Potential and force field of an extended star. Every mass element situated in $\mathbf{x}$
creates a differential potential energy for a pointlike probe of mass $m_0$ situated in
$\mathbf{y}$ (inside or outside the mass distribution), given by

$$ dU(\mathbf{y}) = -\frac{G\,dm\,m_0}{|\mathbf{x} - \mathbf{y}|} = -G m_0\,\frac{\varrho(\mathbf{x})}{|\mathbf{x} - \mathbf{y}|}\,d^3x . \tag{1.100} $$

The probe experiences the differential force

$$ d\mathbf{K} = -\nabla_y\,dU = -\frac{G m_0\,\varrho(\mathbf{x})}{|\mathbf{x} - \mathbf{y}|^2}\,\frac{\mathbf{y} - \mathbf{x}}{|\mathbf{y} - \mathbf{x}|}\,d^3x . \tag{1.101} $$

Either formula, (1.100) or (1.101), can be integrated over the entire star. For
instance, the total potential energy of the mass $m_0$ is

$$ U(\mathbf{y}) = -G m_0 \int \frac{\varrho(\mathbf{x})}{|\mathbf{x} - \mathbf{y}|}\,d^3x . \tag{1.102} $$

The vector $\mathbf{x}$ scans the mass distribution, while $\mathbf{y}$ denotes the point where the
potential is to be calculated. The force field that belongs to this potential follows
from (1.102), as usual, by taking the gradient with respect to $\mathbf{y}$, viz.

$$ \mathbf{K}(\mathbf{y}) = -\nabla_y U(\mathbf{y}) . \tag{1.103} $$

(ii) Celestial body with spherical symmetry. Let $\varrho(\mathbf{x}) = \varrho(s)$, with $s = |\mathbf{x}|$, and
let $\varrho(s) = 0$ for $s > R$. In (1.102) we take the direction of the vector $\mathbf{y}$ as the
$z$-axis. Denoting by $r = |\mathbf{y}|$ the modulus of $\mathbf{y}$ and integrating over the azimuth $\phi$,
we find that

$$ U = -2\pi G m_0 \int_0^\infty s^2\,ds \int_{-1}^{+1} dz\,\frac{\varrho(s)}{\sqrt{r^2 + s^2 - 2rsz}} , \qquad z \overset{\mathrm{def}}{=} \cos\theta . $$

The integral over $z$ is elementary,

$$ \int_{-1}^{+1} dz\,(r^2 + s^2 - 2rsz)^{-1/2} = -\frac{1}{rs}\,\big[\,|r - s| - (r + s)\,\big] = \begin{cases} 2/r & \text{for } r > s \\ 2/s & \text{for } r < s . \end{cases} $$

One sees that $U$ is spherically symmetric, too, and that it is given by

$$ U(r) = -4\pi G m_0 \left( \frac{1}{r} \int_0^r s^2\,ds\,\varrho(s) + \int_r^\infty s\,ds\,\varrho(s) \right) . \tag{1.104} $$

For $r > R$ the second integral does not contribute, because $\varrho(s)$ vanishes for
$s > R$. The first integral extends from 0 to $R$ and, from (1.97), gives the constant
$m/4\pi$. Thus one obtains

$$ U(r) = -\frac{G m_0 m}{r} \quad \text{for } r > R . \tag{1.105} $$

In the space outside its mass distribution a spherically symmetric star with 
total mass m creates the same potential as a mass point m placed at its 
center of symmetry. 



It is obvious that this result is of great importance for the application of Kepler’s 
laws to planetary motion. 
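
The statement can be tested numerically without using the closed-form $z$-integral. The sketch below evaluates the double integral that precedes (1.104) with the midpoint rule and compares the result, for a homogeneous sphere, with the point-mass potential (1.105). The units $G = m_0 = 1$, the grid sizes, the sample density `rho0` and the function names are assumptions made for this illustration.

```python
import math


def z_integral(r, s, n=4000):
    """Midpoint-rule value of the integral over z = cos(theta) from -1 to +1."""
    step = 2.0 / n
    total = 0.0
    for k in range(n):
        z = -1.0 + (k + 0.5) * step
        total += step / math.sqrt(r * r + s * s - 2 * r * s * z)
    return total


def potential(r, density, R, G=1.0, m0=1.0, n=400):
    """U(r) from the double integral preceding (1.104), for density(s) = 0 beyond R.

    Intended for points outside the body (r > R), where the z-integrand is smooth.
    """
    step = R / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * step
        total += s * s * density(s) * z_integral(r, s, n) * step
    return -2 * math.pi * G * m0 * total


# Homogeneous sphere of radius 1 with constant density rho0 (sample values).
rho0 = 0.75
radius = 1.0
mass = 4 * math.pi / 3 * rho0 * radius**3

U_outside = potential(2.0, lambda s: rho0, radius)
print(U_outside, -mass / 2.0)   # numerical value and -G m0 m / r at r = 2
```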

(iii) Interaction of two celestial bodies of finite extension. If the probe of mass $m_0$
has a finite extension, too, and is characterized by the mass density $\varrho_0(\mathbf{y})$, (1.102)
is replaced by the differential potential

$$ dU(\mathbf{y}) = -G\,\varrho_0(\mathbf{y})\,d^3y \int \frac{\varrho(\mathbf{x})}{|\mathbf{x} - \mathbf{y}|}\,d^3x . $$

This is the potential energy of the mass element $\varrho_0(\mathbf{y})\,d^3y$ in the field of the first
star. The total potential energy is obtained from this by integrating over $\mathbf{y}$:

$$ U = -G \int d^3x \int d^3y\,\frac{\varrho(\mathbf{x})\,\varrho_0(\mathbf{y})}{|\mathbf{x} - \mathbf{y}|} . \tag{1.106} $$

If both densities are spherically symmetric, their radii being $R$ and $R_0$, we obtain
again (1.105) whenever the distance of the two centers is larger than $(R + R_0)$.

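(iv) Potential of a star with finite extension that is not spherically symmetric.
Assume the density $\varrho(\mathbf{x})$ still to be finite (that is, $\varrho(\mathbf{x}) = 0$ for $|\mathbf{x}| > R$) but not
necessarily spherically symmetric. In calculating the integral (1.102) the following
expansion of the inverse distance is particularly useful:

$$ \frac{1}{|\mathbf{x} - \mathbf{y}|} = 4\pi \sum_{l=0}^{\infty} \frac{1}{2l + 1}\,\frac{r_<^{\,l}}{r_>^{\,l+1}} \sum_{\mu=-l}^{l} Y^*_{l\mu}(\hat{\mathbf{x}})\,Y_{l\mu}(\hat{\mathbf{y}}) . \tag{1.107} $$

Here $r_< = |\mathbf{x}|$, $r_> = |\mathbf{y}|$ if $|\mathbf{y}| > |\mathbf{x}|$, and correspondingly $r_< = |\mathbf{y}|$, $r_> = |\mathbf{x}|$
if $|\mathbf{y}| < |\mathbf{x}|$. The symbols $Y_{l\mu}$ denote well-known special functions, spherical
harmonics, whose arguments are the polar angles of $\mathbf{x}$ and $\mathbf{y}$:

$$ (\theta_x, \phi_x) \equiv \hat{\mathbf{x}} , \qquad (\theta_y, \phi_y) \equiv \hat{\mathbf{y}} . $$

These functions are normalized and orthogonal in the following sense:

$$ \int_0^{\pi} \sin\theta\,d\theta \int_0^{2\pi} d\phi\;Y^*_{l\mu}(\theta, \phi)\,Y_{l'\mu'}(\theta, \phi) = \delta_{ll'}\,\delta_{\mu\mu'} \tag{1.108} $$

(see e.g. Abramowitz, Stegun 1965). Inserting this expansion in (1.102) and
choosing $|\mathbf{y}| > R$, one obtains

$$ U(\mathbf{y}) = -G m_0 \sum_{l=0}^{\infty} \frac{4\pi}{2l + 1} \sum_{\mu=-l}^{+l} \frac{q_{l\mu}\,Y_{l\mu}(\hat{\mathbf{y}})}{r^{l+1}} , \tag{1.109} $$

where

$$ q_{l\mu} \overset{\mathrm{def}}{=} \int d^3x\;Y^*_{l\mu}(\hat{\mathbf{x}})\,s^l\,\varrho(\mathbf{x}) . \tag{1.110} $$

The first spherical harmonic is a constant: $Y_{l=0,\mu=0} = 1/\sqrt{4\pi}$. If $\varrho(\mathbf{x})$ is taken
to be spherically symmetric, one obtains

$$ q_{l\mu} = \sqrt{4\pi} \int_0^R s^2\,ds\,s^l\,\varrho(s) \int_0^{\pi} \sin\theta\,d\theta \int_0^{2\pi} d\phi\;Y_{00}\,Y^*_{l\mu} = \sqrt{4\pi} \int_0^R s^2\,ds\,\varrho(s)\,\delta_{l0}\,\delta_{\mu 0} = \frac{m}{\sqrt{4\pi}}\,\delta_{l0}\,\delta_{\mu 0} , $$

so that (1.109) leads to the result (1.105), as expected. The coefficients $q_{l\mu}$ are
called multipole moments of the density $\varrho(\mathbf{x})$. The potentials that they create,

$$ U_{l\mu}(\mathbf{y}) = -G m_0\,\frac{4\pi q_{l\mu}}{2l + 1}\,\frac{Y_{l\mu}(\hat{\mathbf{y}})}{r^{l+1}} , \tag{1.111} $$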

are called multipole potentials. In the case of spherical symmetry only the multipole 
moment with / = 0 is nonzero, while in the absence of this symmetry many or all 
multipole moments will contribute. 
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Let us return to the n -particle system as described by the equations of motion 
(1.28). We assume that the system is closed and autonomous, i.e. that there are 
only internal, time-independent forces. We further assume that these are potential 
forces but not necessarily central forces. For just n = 3, general solutions of the 
equations of motion are known only for certain special situations. Very little is 
known for more than three particles. Therefore, the following approach is useful 
because it yields at least some qualitative information. 

We suppose that we know the solutions r t (t). and therefore also the momenta 
Pi(t) — mrj(t). We then construct the following mapping from phase space onto 
the real numbers: 

n 

v(t)= J2 r i(t) ■ Pi(t) . ( 1 . 112 ) 

1 = 1 



This function is called the virial. If a specific solution has the property that no
particle ever escapes to infinity or takes on an infinitely large momentum, then
$v(t)$ remains bounded for all times. Defining time averages as follows:

$$ \langle f \rangle \stackrel{\text{def}}{=} \lim_{\Delta\to\infty} \frac{1}{2\Delta} \int_{-\Delta}^{+\Delta} f(t) \, dt \, , \tag{1.113} $$

the average of the time derivative of $v(t)$ is then shown to vanish, viz.

$$ \langle \dot{v} \rangle = \lim_{\Delta\to\infty} \frac{1}{2\Delta} \int_{-\Delta}^{+\Delta} dt \, \frac{dv(t)}{dt} = \lim_{\Delta\to\infty} \frac{v(\Delta) - v(-\Delta)}{2\Delta} = 0 \, . $$



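Since

$$ \dot{v}(t) = \sum_{i=1}^{n} m_i \dot{\mathbf{r}}_i^{\,2}(t) - \sum_{i=1}^{n} \mathbf{r}_i(t) \cdot \nabla_i U(\mathbf{r}_1(t), \dots, \mathbf{r}_n(t)) \, , $$

we obtain for the time average

$$ 2\langle T \rangle - \Big\langle \sum_{i=1}^{n} \mathbf{r}_i \cdot \nabla_i U \Big\rangle = 0 \, . \tag{1.114} $$

This result is called the virial theorem. It takes a particularly simple form when $U$
is a homogeneous function of degree $k$ in its arguments $\mathbf{r}_1, \dots, \mathbf{r}_n$. In this case
$\sum_i \mathbf{r}_i \cdot \nabla_i U = kU$, so that (1.114) and the principle of energy conservation give

$$ 2\langle T \rangle - k\langle U \rangle = 0 \, , \qquad \langle T \rangle + \langle U \rangle = E \, . \tag{1.115} $$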



Examples of interest follow.

(i) Two-body systems with harmonic force. Transforming to center-of-mass and
relative coordinates, one has

$$ v(t) = m_1 \mathbf{r}_1 \cdot \dot{\mathbf{r}}_1 + m_2 \mathbf{r}_2 \cdot \dot{\mathbf{r}}_2 = M \mathbf{r}_S \cdot \dot{\mathbf{r}}_S + \mu \, \mathbf{r} \cdot \dot{\mathbf{r}} \, . $$

The function $v$ remains bounded only if the center of mass is at rest, $\dot{\mathbf{r}}_S = 0$.
However, the kinetic energy is then equal to the kinetic energy of the relative
motion so that (1.115) applies to the latter and to $U(r)$. In this example $U(r) = \alpha r^2$,
i.e. $k = 2$. The time averages of the kinetic energy and potential energy of
relative motion are the same and are equal to half the energy,

$$ \langle T \rangle = \langle U \rangle = \tfrac{1}{2} E \, . $$

(ii) In the case of the Kepler problem the potential is $U(r) = -\alpha/r$, where $r$ denotes
the relative coordinate. Thus $k = -1$. For $E < 0$ (only then is $v(t)$ bounded)
one finds for the time averages of kinetic and potential energies of relative motion

$$ \langle T \rangle = -E \, ; \qquad \langle U \rangle = 2E \, . $$

Note that this is valid only in $\mathbb{R}^3 \setminus \{0\}$ for the variable $\mathbf{r}$. The origin where the force
becomes infinite should be excluded. For the two-body system this is guaranteed
whenever the relative angular momentum is nonzero.

(iii) For an $n$-particle system ($n \ge 3$) with gravitational forces some information
can also be obtained. We first note that $v(t)$ is the derivative of the function

$$ w(t) \stackrel{\text{def}}{=} \sum_{i=1}^{n} \tfrac{1}{2} m_i \mathbf{r}_i^2(t) \, , $$

which is bounded, provided that no particle ever escapes to infinity. As one can
easily show,

$$ \ddot{w}(t) = 2T + U = E + T \, . $$

Since $T(t)$ is positive at all times, $w(t)$ can be estimated by means of the general
solution of the differential equation $\ddot{y}(t) = E$. Indeed

$$ w(t) \ge \tfrac{1}{2} E t^2 + \dot{w}(0) \, t + w(0) \, . $$

If the total energy is positive, then $\lim_{t\to\pm\infty} w(t) = \infty$, which means that at
least one particle will escape to infinity asymptotically (see also Thirring 1992,
Sect. 4.5).
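
The time averages quoted for the Kepler problem, $\langle T \rangle = -E$ and $\langle U \rangle = 2E$, can be checked numerically. The following is a minimal sketch and not part of the original text: it assumes a relative motion with reduced mass `mu` in the potential $-A/r$, integrates one full period with a velocity-Verlet scheme (a choice made here for simplicity), and averages over that period, which for a periodic orbit equals the infinite-time average. The function name and all parameter values are illustrative assumptions.

```python
import math

def kepler_time_averages(mu=1.0, A=1.0, r0=1.0, v0=0.8, steps=20_000):
    """Average T and U over one period of the bound relative motion in U = -A/r.

    The orbit starts at distance r0 with purely tangential speed v0
    (illustrative initial data; E must be negative).
    Returns (<T>, <U>, E).
    """
    x, y = r0, 0.0
    vx, vy = 0.0, v0
    E = 0.5 * mu * v0 ** 2 - A / r0
    if E >= 0.0:
        raise ValueError("the orbit must be bound (E < 0)")
    a = A / (2.0 * (-E))                                  # semi-major axis
    period = 2.0 * math.pi * math.sqrt(mu * a ** 3 / A)   # Kepler's third law
    h = period / steps

    def acc(px, py):
        r3 = (px * px + py * py) ** 1.5
        return -A * px / (mu * r3), -A * py / (mu * r3)

    ax, ay = acc(x, y)
    sum_T = sum_U = 0.0
    for _ in range(steps):
        sum_T += 0.5 * mu * (vx * vx + vy * vy)
        sum_U += -A / math.hypot(x, y)
        # one velocity-Verlet step
        x += h * vx + 0.5 * h * h * ax
        y += h * vy + 0.5 * h * h * ay
        ax_new, ay_new = acc(x, y)
        vx += 0.5 * h * (ax + ax_new)
        vy += 0.5 * h * (ay + ay_new)
        ax, ay = ax_new, ay_new
    return sum_T / steps, sum_U / steps, E

T_avg, U_avg, E = kepler_time_averages()
print(T_avg, -E)      # the two numbers should agree closely
print(U_avg, 2 * E)   # likewise
```

Within the accuracy of the integrator the two averages should reproduce the virial relations with $k = -1$.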
Appendix: Practical Examples



1. Kepler Ellipses. Study numerical examples for finite motion of two celestial
bodies in their center-of-mass frame (Sect. 1.7.2).

Solution. The relevant equations are found at the end of Sect. 1.7.2. It is convenient
to express $m_1$ and $m_2$ in terms of the total mass $M = m_1 + m_2$ and to set $M = 1$.
The reduced mass is then $\mu = m_1 m_2$. For given masses the form of the orbits is
determined by the parameters

$$ p = \frac{l^2}{A\mu} \quad \text{and} \quad \varepsilon = \sqrt{1 + \frac{2El^2}{\mu A^2}} \, , \tag{A.1} $$

which in turn are determined by the energy $E$ and the angular momentum. It is
easy to calculate and to draw the orbits on a PC. Figure 1.6a shows the example
$m_1 = m_2$ with $\varepsilon = 0.5$, $p = 1$, while Fig. 1.6b shows the case $m_1 = m_2/9$ with
$\varepsilon = 0.5$, $p = 0.66$. As the origin is the center of mass, the two stars are at opposite
positions at any time.



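2. Motion of a Double Star. Calculate the two orbital ellipses of the stars of the
preceding example pointwise, as a function of time, for a given time interval $\Delta t$.

Solution. In Example 1 the figures show $r(\varphi)$ as a function of $\varphi$. They do not
indicate how the stars move on their orbits as a function of time. In order to obtain
$r(t)$, one returns to (1.19) and inserts the relative coordinate $r(\varphi)$. Separation of
variables yields

$$ t_{n+1} - t_n = \frac{\mu p^2}{l} \int_{\varphi_n}^{\varphi_{n+1}} \frac{d\varphi}{(1 + \varepsilon\cos\varphi)^2} \tag{A.2} $$

for the orbital points $n$ and $n+1$. (The pericenter has $\varphi_P = 0$.) The quantity $\mu p^2/l$
has the dimension of time. Introduce the period from (1.23) and use this as the
unit of time,

$$ T = 2\pi \frac{\mu^{1/2} a^{3/2}}{A^{1/2}} = \pi \frac{A \mu^{1/2}}{2^{1/2} (-E)^{3/2}} \, . $$

Then

$$ \frac{\mu p^2}{l} = (1 - \varepsilon^2)^{3/2} \, \frac{T}{2\pi} \, . $$

The integral in (A.2) can be done exactly. Substituting

$$ x \stackrel{\text{def}}{=} \sqrt{\frac{1-\varepsilon}{1+\varepsilon}} \, \tan\frac{\varphi}{2} \, , $$

one has

$$ I \equiv \int \frac{d\varphi}{(1 + \varepsilon\cos\varphi)^2} = \frac{2}{\sqrt{1-\varepsilon^2}} \, \frac{1}{1+\varepsilon} \int dx \, \frac{1 + [(1+\varepsilon)/(1-\varepsilon)] x^2}{(1+x^2)^2} = \frac{2}{\sqrt{1-\varepsilon^2}} \, \frac{1}{1+\varepsilon} \left[ \int \frac{dx}{1+x^2} + \frac{2\varepsilon}{1-\varepsilon} \int \frac{x^2 \, dx}{(1+x^2)^2} \right] , $$

whose second term can be integrated by parts. The result is

$$ I = \frac{2}{(1-\varepsilon^2)^{3/2}} \arctan\left( \sqrt{\frac{1-\varepsilon}{1+\varepsilon}} \, \tan\frac{\varphi}{2} \right) - \frac{\varepsilon}{1-\varepsilon^2} \, \frac{\sin\varphi}{1 + \varepsilon\cos\varphi} \, , \tag{A.3} $$

so that

$$ \frac{t_{n+1} - t_n}{T} = \frac{1}{\pi} \left[ \arctan\left( \sqrt{\frac{1-\varepsilon}{1+\varepsilon}} \, \tan\frac{\varphi}{2} \right) - \frac{1}{2} \varepsilon \sqrt{1-\varepsilon^2} \, \frac{\sin\varphi}{1 + \varepsilon\cos\varphi} \right]_{\varphi_n}^{\varphi_{n+1}} . \tag{A.4} $$

One can compute the function $\Delta t(\Delta\varphi, \varphi)$ for a fixed increment $\Delta\varphi$ and mark
the corresponding positions on the orbit. Alternatively, one may give a fixed time
interval $\Delta t/T$ and determine succeeding orbital positions by solving the implicit
equation (A.4) in terms of $\varphi$.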
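
Both procedures just described can be sketched in a few lines. The code below is an illustration under stated assumptions, not the author's program: it measures time from the pericenter passage ($\varphi = 0$), evaluates the bracket of (A.4), i.e. $\frac{1}{\pi}\big[\arctan\big(\sqrt{(1-\varepsilon)/(1+\varepsilon)}\tan(\varphi/2)\big) - \frac{1}{2}\varepsilon\sqrt{1-\varepsilon^2}\,\sin\varphi/(1+\varepsilon\cos\varphi)\big]$, using `atan2` so that the arctangent stays continuous across $\varphi = \pi$, and inverts the relation by bisection (a choice made here; any root finder would do). The function names are hypothetical.

```python
import math

def time_fraction(phi, eps):
    """t/T elapsed since pericenter passage for azimuth phi in [0, 2*pi]."""
    k = math.sqrt((1.0 - eps) / (1.0 + eps))
    # atan2 form of arctan(k * tan(phi/2)); continuous on 0 <= phi <= 2*pi
    first = math.atan2(k * math.sin(0.5 * phi), math.cos(0.5 * phi))
    second = (0.5 * eps * math.sqrt(1.0 - eps * eps)
              * math.sin(phi) / (1.0 + eps * math.cos(phi)))
    return (first - second) / math.pi

def azimuth_at(t_frac, eps, tol=1e-12):
    """Solve time_fraction(phi, eps) = t_frac for phi in [0, 2*pi] by bisection.

    This works because t/T increases monotonically with phi.
    """
    lo, hi = 0.0, 2.0 * math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if time_fraction(mid, eps) < t_frac:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Azimuths of the relative coordinate at twelve equally spaced times, eps = 0.5
eps = 0.5
for n in range(12):
    phi = azimuth_at(n / 12.0, eps)
    print(n, round(phi, 4))
```

The printed azimuths crowd together near $\varphi = \pi$ (apocenter), where the motion is slowest, as expected from the area law.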



3. Precession of Perihelion. (a) For the case of bound orbits in the Kepler problem
show that the differential equation for $\varphi = \varphi(r)$ takes the form

$$ \frac{d\varphi}{dr} = \frac{1}{r} \sqrt{\frac{r_P r_A}{(r - r_P)(r_A - r)}} \, , \tag{A.5} $$

where $r_P$ and $r_A$ denote pericenter and apocenter, respectively. Integrate this equation
with the boundary condition $\varphi(r = r_P) = 0$.

(b) The potential is now modified into $U(r) = -A/r + B/r^2$. Determine the
solution $\varphi = \varphi(r)$ and discuss the precession of the pericenter after one turn, in
comparison with the Kepler case, as a function of $B \gtrless 0$ where $|B| \ll l^2/2\mu$.

Solutions. (a) For elliptical orbits, $E < 0$, and one has

$$ \frac{d\varphi}{dr} = \frac{l}{\sqrt{2\mu(-E)}} \, \frac{1}{r} \, \frac{1}{\sqrt{-r^2 - Ar/E + l^2/(2\mu E)}} \, . $$

Apocenter and pericenter are given by the roots of the quadratic form
$(-r^2 - Ar/E + l^2/2\mu E)$:

$$ r_{A/P} = \frac{p}{1 \mp \varepsilon} = \frac{A}{2(-E)} (1 \pm \varepsilon) \tag{A.6} $$

(these are the points where $dr/dt = 0$). With

$$ r_P r_A = \frac{A^2}{4E^2} (1 - \varepsilon^2) = -\frac{l^2}{2\mu E} $$

we obtain (A.5). This equation can be integrated. With the condition $\varphi(r_P) = 0$
one obtains

$$ \varphi(r) = \arccos\left[ \frac{1}{r_A - r_P} \left( \frac{2 r_A r_P}{r} - r_A - r_P \right) \right] . \tag{A.7} $$

As $\varphi(r_A) - \varphi(r_P) = \pi$, one confirms that the pericenter, force center, and apocenter
lie on a straight line. Two succeeding pericenter constellations have azimuths
differing by $2\pi$, i.e. they coincide. There is no precession of the pericenter.

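(b) Let $r_P$ and $r_A$ be defined as in (A.6). The new apocenter and pericenter
positions, in the perturbed potential, are denoted by $r'_A$ and $r'_P$, respectively. One
has

$$ (r - r_P)(r_A - r) + \frac{B}{E} = (r - r'_P)(r'_A - r) $$

and therefore

$$ r'_P r'_A = r_P r_A - \frac{B}{E} \, . \tag{A.8} $$

Equation (A.5) is modified as follows:

$$ \frac{d\varphi}{dr} = \frac{1}{r} \sqrt{\frac{r_P r_A}{(r - r'_P)(r'_A - r)}} = \sqrt{\frac{r_P r_A}{r'_P r'_A}} \; \frac{1}{r} \sqrt{\frac{r'_P r'_A}{(r - r'_P)(r'_A - r)}} \, . $$

This equation can be integrated as before under (a):

$$ \varphi(r) = \sqrt{\frac{r_P r_A}{r'_P r'_A}} \, \arccos\left[ \frac{1}{r'_A - r'_P} \left( \frac{2 r'_A r'_P}{r} - r'_A - r'_P \right) \right] . \tag{A.9} $$

From (A.8) two successive pericenter configurations differ by

$$ 2\pi \sqrt{\frac{r_P r_A}{r'_P r'_A}} = \frac{2\pi l}{\sqrt{l^2 + 2\mu B}} \, . \tag{A.10} $$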



This difference can be studied numerically, as a function of positive or negative B. 
Positive B means that the additional potential is repulsive so that, from (A. 10), the 
pericenter will “stay behind”. Negative B means additional attraction and causes 
the pericenter to “advance”. 
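
As a small numerical illustration (a sketch, not taken from the original), the azimuth between successive pericenters, $2\pi l/\sqrt{l^2 + 2\mu B}$, can be compared with the Kepler value $2\pi$. The function name and the sample values of $l$, $\mu$ and $B$ are assumptions chosen for illustration only.

```python
import math

def pericenter_shift(l, mu, B):
    """Azimuth between two successive pericenters minus 2*pi.

    Negative values mean the pericenter stays behind, positive values
    mean it advances. Requires l**2 + 2*mu*B > 0.
    """
    radicand = l * l + 2.0 * mu * B
    if radicand <= 0.0:
        raise ValueError("requires l**2 + 2*mu*B > 0")
    return 2.0 * math.pi * (l / math.sqrt(radicand) - 1.0)

for B in (-0.02, -0.01, 0.0, 0.01, 0.02):
    print(B, pericenter_shift(l=1.0, mu=1.0, B=B))
```

For $|B| \ll l^2/2\mu$ the shift is approximately $-2\pi\mu B/l^2$, i.e. linear in $B$, which the printed values should reflect.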
4. Rosettelike Orbits. Study the finite orbits in the attractive potential
$U(r) = -a/r^\alpha$, for some values of the exponent $\alpha$ in the neighborhood of $\alpha = 1$.

Solution. Use as a starting point the system (1.70'-71') of first-order differential
equations, written in dimensionless form:

$$ \frac{d\varrho}{d\tau} = \pm\sqrt{2b\varrho^{-\alpha} - \varrho^{-2} - 2} \equiv f(\varrho) \, , \qquad \frac{d\varphi}{d\tau} = \frac{1}{\varrho^2} \, . \tag{A.11} $$

From this calculate the second derivatives:

$$ \frac{d^2\varrho}{d\tau^2} = \frac{d}{d\varrho}\left( \frac{d\varrho}{d\tau} \right) \frac{d\varrho}{d\tau} = \frac{1}{\varrho^3} \left( 1 - b\alpha\varrho^{2-\alpha} \right) \equiv g(\varrho) \, , \qquad \frac{d^2\varphi}{d\tau^2} = -\frac{2}{\varrho^3} f(\varrho) \, . $$

Equation (A.11) can be solved approximately by means of simple Taylor series:

$$ \varrho_{n+1} = \varrho_n + h f(\varrho_n) + \tfrac{1}{2} h^2 g(\varrho_n) + O(h^3) \, , $$

$$ \varphi_{n+1} = \varphi_n + \frac{h}{\varrho_n^2} - \frac{h^2}{\varrho_n^3} f(\varrho_n) + O(h^3) \, , \tag{A.12} $$

for the initial conditions $\tau_0 = 0$, $\varrho(0) = R_0$, $\varphi(0) = 0$. The step size $h$ for the
time variable can be taken to be constant. Thus, if one plots the rosette pointwise,
one can follow the temporal evolution of the motion. (In Figs. 1.18-22 we have
chosen $h$ to be variable, instead, taking $h = h_0 \varrho/R_0$, with $h_0 = 0.02$.)
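
A possible implementation of such a stepping scheme is sketched below; it is an illustration, not the program used for the figures. It works with the dimensionless equations $d^2\varrho/d\tau^2 = g(\varrho) = \varrho^{-3}(1 - b\alpha\varrho^{2-\alpha})$ and $d\varphi/d\tau = \varrho^{-2}$. Two assumptions are made here and should be read as such: the radial velocity $u = d\varrho/d\tau$ is carried along as a separate variable (so that the sign of $f$ never has to be chosen by hand at the turning points), and the orbit is started at a turning point $\varrho(0) = R_0$, $u(0) = 0$, which fixes $b$ through $2bR_0^{-\alpha} - R_0^{-2} - 2 = 0$.

```python
import math

def rosette(alpha, R0=1.0, h=1e-3, n_steps=10_000):
    """Return lists (rho, phi) of orbit points for constant step size h."""
    # b chosen such that the radial velocity vanishes at rho = R0 (assumption)
    b = 0.5 * (2.0 + R0 ** -2) * R0 ** alpha

    def g(rho):
        return (1.0 - b * alpha * rho ** (2.0 - alpha)) / rho ** 3

    rho, u, phi = R0, 0.0, 0.0
    rhos, phis = [rho], [phi]
    for _ in range(n_steps):
        g_old = g(rho)
        # second-order Taylor step for the angle, using the old rho and u
        phi += h / rho ** 2 - h * h * u / rho ** 3
        # second-order Taylor step for the radius
        rho += h * u + 0.5 * h * h * g_old
        # update of the radial velocity with the averaged acceleration
        u += 0.5 * h * (g_old + g(rho))
        rhos.append(rho)
        phis.append(phi)
    return rhos, phis

rhos, phis = rosette(alpha=1.0)
print(min(rhos), max(rhos))                       # turning points of the Kepler ellipse
x = [r * math.cos(p) for r, p in zip(rhos, phis)]  # Cartesian points for plotting
y = [r * math.sin(p) for r, p in zip(rhos, phis)]
```

For $\alpha = 1$ and $R_0 = 1$ one has $b = 3/2$, and the radicand of (A.11) vanishes at $\varrho = 1$ and $\varrho = 1/2$, so the radius should oscillate between these two values; exponents slightly different from 1 produce the rosettes.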



5. Scattering Orbits for a Repulsive Potential. A particle of fixed momentum
$\mathbf{p}$ is scattered in the field of the potential $U(r) = A/r$ (with $A > 0$). Study the
scattering orbits as a function of the impact parameter.

Solution. The orbit is given by

$$ r = r(\varphi) = \frac{p}{1 + \varepsilon\cos(\varphi - \varphi_0)} \tag{A.13} $$

with $\varepsilon > 1$. The energy $E$ must be positive. We choose $\varphi_0 = 0$ and introduce the
impact parameter $b \stackrel{\text{def}}{=} l/|\mathbf{p}|$ and the quantity $\lambda \stackrel{\text{def}}{=} A/E$ as a characteristic length
of the problem. The equation of the hyperbola (A.13) then reads

$$ \frac{r(\varphi)}{\lambda} = \frac{2b^2/\lambda^2}{1 + \sqrt{1 + 4b^2/\lambda^2} \, \cos\varphi} \, . \tag{A.13'} $$

Introducing Cartesian coordinates (see Sect. 1.7.2), we find that (A.13') becomes

$$ \frac{4x^2}{\lambda^2} - \frac{y^2}{b^2} = 1 \, . $$

This hyperbola takes on a symmetric position with respect to the coordinate axes,
its asymptotes having the slopes $\tan\varphi_0$ and $-\tan\varphi_0$, respectively, where

$$ \varphi_0 = \arctan\frac{2b}{\lambda} \, . \tag{A.14} $$

We restrict the discussion to the left-hand branch of the hyperbola. We want the
particle always to come in along the same direction, say along the negative x-axis.
For a given impact parameter $b$ this is achieved by means of a rotation about the
focus on the positive x-axis, viz.

$$ u = (x - c)\cos\varphi_0 + y\sin\varphi_0 \, , $$

$$ v = -(x - c)\sin\varphi_0 + y\cos\varphi_0 \, , \tag{A.15} $$

where $c = \sqrt{1 + 4b^2/\lambda^2}/2$ is the distance of the focus from the origin, and
$y = \pm b\sqrt{4x^2/\lambda^2 - 1}$. For all $b$, the particle comes in from $-\infty$ along a direction
parallel to the $u$-axis, with respect to the coordinate system $(u, v)$. Starting from
the pericenter ($x_0/\lambda = -\tfrac{1}{2}$, $y_0 = 0$), let $y$ run upwards and downwards and use
(A.15) to calculate the corresponding values of x and y (see Fig. 1.32).



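6. Temporal Evolution for Rutherford Scattering. For the example in Sect. 1.29
calculate and plot a few positions of the projectile and target as a function of time,
in the laboratory system.

Solution. In the laboratory system the orbits are given, as functions of $\varphi$, by (1.90-92).
With $\varphi_1 = \pi - \varphi$, $\varphi_2 = 2\pi - \varphi$,

$$ \mathbf{r}_1 = \mathbf{r}_S + \tfrac{1}{2}\mathbf{r} = \sqrt{\frac{E_r}{m}} \, t \, (1, 0) + \frac{p}{2} \, \frac{1}{\varepsilon\cos(\varphi - \varphi_0) - 1} \, (-\cos\varphi, \sin\varphi) \, , $$

$$ \mathbf{r}_2 = \mathbf{r}_S - \tfrac{1}{2}\mathbf{r} = \sqrt{\frac{E_r}{m}} \, t \, (1, 0) + \frac{p}{2} \, \frac{1}{\varepsilon\cos(\varphi - \varphi_0) - 1} \, (\cos\varphi, -\sin\varphi) \, . \tag{A.16} $$

The integral (1.95) that relates the variables $t$ and $\varphi$ is calculated as in Example 2.
Noting that here $\varepsilon > 1$ and making use of the formula

$$ \arctan x = -\frac{i}{2} \ln\frac{1 + ix}{1 - ix} \, , $$

we find that

$$ \sqrt{\frac{E_r}{m}} \, t(\varphi) = \frac{p}{2} \left[ \frac{1}{\varepsilon^2 - 1} \ln\frac{1 + u}{1 - u} + \frac{\varepsilon}{\sqrt{\varepsilon^2 - 1}} \, \frac{\sin(\varphi - \varphi_0)}{\varepsilon\cos(\varphi - \varphi_0) - 1} \right] , \tag{A.17} $$

where $u$ stands for the expression

$$ u = \sqrt{\frac{\varepsilon + 1}{\varepsilon - 1}} \, \tan\frac{\varphi - \varphi_0}{2} \, . $$

Furthermore, we have

$$ \cos\varphi_0 = \frac{1}{\varepsilon} \, , \qquad \sin\varphi_0 = \frac{\sqrt{\varepsilon^2 - 1}}{\varepsilon} \, , $$

$$ \tan\frac{\varphi - \varphi_0}{2} = \frac{\sin\varphi - \sin\varphi_0}{\cos\varphi + \cos\varphi_0} = \frac{\varepsilon\sin\varphi - \sqrt{\varepsilon^2 - 1}}{1 + \varepsilon\cos\varphi} \, . $$

Equation (A.17) gives the relation between $\varphi$ and $t$. Using dimensionless coordinates
$(2x/p, 2y/p)$, one plots points for equidistant values of $\varphi$ and notes the
corresponding value of the dimensionless time variable

$$ \tau \stackrel{\text{def}}{=} \frac{2}{p} \sqrt{\frac{E_r}{m}} \, t \, . $$

Figure 1.35 shows the example $\varepsilon = 1.155$, $\varphi_0 = 30°$ (consistent with $\cos\varphi_0 = 1/\varepsilon$). Alternatively, one may
choose a fixed time interval with respect to $t(\varphi_0) = 0$ and calculate the corresponding
values of $\varphi$ from (A.17).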




2. The Principles of Canonical Mechanics 



Canonical mechanics is a central part of general mechanics, where one goes be- 
yond the somewhat narrow framework of Newtonian mechanics with position co- 
ordinates in the three-dimensional space, towards a more general formulation of 
mechanical systems belonging to a much larger class. This is the first step of ab- 
straction, leaving behind ballistics, satellite orbits, inclined planes, and pendulum- 
clocks; it leads to a new kind of description that turns out to be useful in areas 
of physics far beyond mechanics. Through d’Alembert’s principle we discover the 
concept of the Lagrangian function and the framework of Lagrangian mechanics 
that is built onto it. Lagrangian functions are particularly useful for studying the 
role symmetries and invariances of a given system play in its description. By means 
of the Legendre transformation we are then led to the Hamiltonian function, which 
is central to the formulation of canonical mechanics, as developed by Hamilton 
and Jacobi. 

Although these two frameworks of description at first seem artificial and un- 
necessarily abstract, their use pays in very many respects: the formulation of me- 
chanics over the phase space yields a much deeper insight into its dynamical and 
geometrical structure. At the same time, this prepares the foundation and formal 
framework for other physical theories, without which, for example, quantum me- 
chanics cannot be understood and perhaps could not even be formulated. 



2.1 Constraints and Generalized Coordinates 

2.1.1 Definition of Constraints 

Whenever the mass points of a mechanical system cannot move completely in- 
dependently because they are subject to certain geometrical conditions, we talk 
about constraints. These must be discussed independently because they reduce the 
number of degrees of freedom and therefore change the equations of motion. 

(i) The constraints are said to be holonomic (from the Greek: constraints are
given by an “entire law”) if they can be described by a set of independent equations
of the form

$$ f_\lambda(\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_n, t) = 0 \, ; \qquad \lambda = 1, 2, \dots, \Lambda \, . \tag{2.1} $$

Fig. 2.1. A system of three mass points at constant distances
from one another has six degrees of freedom (instead of nine)

Independent means that at any point $(\mathbf{r}_1, \dots, \mathbf{r}_n)$ and for all $t$, the rank of the
matrix $\{\partial f_\lambda/\partial \mathbf{r}_k\}$ is maximal, i.e. equals $\Lambda$. As an example take the three-body
system with the condition that all interparticle distances be constant (see Fig. 2.1):

$$ f_1 \equiv |\mathbf{r}_1 - \mathbf{r}_2| - a_3 = 0 \, , $$

$$ f_2 \equiv |\mathbf{r}_2 - \mathbf{r}_3| - a_1 = 0 \, , $$

$$ f_3 \equiv |\mathbf{r}_3 - \mathbf{r}_1| - a_2 = 0 \, . $$

Here $\Lambda = 3$. Without these constraints the number of degrees of freedom would
be $f = 3n = 9$. The constraints reduce it to $f = 3n - \Lambda = 6$.

(ii) The constraints are said to be nonholonomic if they take the form

$$ \sum_{k=1}^{n} \boldsymbol{\omega}^i_k(\mathbf{r}_1, \dots, \mathbf{r}_n) \cdot d\mathbf{r}_k = 0 \, , \qquad i = 1, \dots, \Lambda \tag{2.2} $$

but cannot be integrated to the form of (2.1). Note that (2.1), by differentiation,
gives a condition of type (2.2), viz.

$$ \sum_{k=1}^{n} \nabla_k f_\lambda(\mathbf{r}_1, \dots, \mathbf{r}_n) \cdot d\mathbf{r}_k = 0 \, . $$

This, however, is a complete differential. In contrast, a nonholonomic constraint
(2.2) is not integrable and cannot be made so by multiplication with a function, a
so-called integrating factor. This class of conditions is the subject of the analysis
of Pfaffian systems. As we study only holonomic constraints in this book, we do
not go into this any further and refer to the mathematics literature for the theory
of Pfaffian forms.

(iii) In either case one distinguishes constraints that are (a) dependent on time
- these are called rheonomic (“running law”) constraints; and (b) independent of
time - these are called scleronomic (“rigid law”) constraints.

(iv) There are other kinds of constraints, which are expressed in the form of
inequalities. Such constraints arise, for instance, when an $n$-particle system (a gas,
for example) is enclosed in a vessel: the particles move freely inside the vessel but
cannot penetrate its boundaries. We do not consider such constraints in this book.

2.1.2 Generalized Coordinates

Any set of independent coordinates that takes into account the constraints is called
a set of generalized coordinates. For example, take a particle moving on the surface of a
sphere of radius $R$ around the origin, as sketched in Fig. 2.2. Here the constraint is
holonomic and reads $x^2 + y^2 + z^2 = R^2$, so that $f = 3n - 1 = 3 - 1 = 2$. Instead
of the dependent coordinates $\{x, y, z\}$ one introduces the independent coordinates

$$ q_1 = \theta \, , \qquad q_2 = \varphi \, . $$

Fig. 2.2. A particle whose motion is restricted to the surface of a sphere
has only two degrees of freedom

In general, the set of $3n$ space coordinates of the $n$-particle system will be
replaced by a set of $(3n - \Lambda)$ generalized coordinates, viz.

$$ \{\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_n\} \to \{q_1, q_2, \dots, q_f\} \, , \qquad f = 3n - \Lambda \, , \tag{2.3} $$

which, in fact, need not have the dimension of length. The aim is now twofold:

(i) determine the number of degrees of freedom $f$ and find $f$ generalized coordinates
that take account of the constraints automatically and that are adapted,
in an optimal way, to the system one is studying;

(ii) develop simple principles from which the equations of motion are obtained
directly, in terms of the generalized coordinates.

We begin by formulating d’Alembert’s principle, which is an important auxiliary 
construction on the way to the goal formulated above. 




2.2 D’Alembert’s Principle 

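Consider a system of $n$ mass points with masses $\{m_i\}$ and coordinates $\{\mathbf{r}_i\}$,
$i = 1, 2, \dots, n$, subject to the holonomic constraints

$$ f_\lambda(\mathbf{r}_1, \dots, \mathbf{r}_n, t) = 0 \, , \qquad \lambda = 1, \dots, \Lambda \, . \tag{2.4} $$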



2.2.1 Definition of Virtual Displacements 

A virtual displacement $\{\delta\mathbf{r}_i\}$ of the system is an arbitrary, infinitesimal change
of the coordinates that is compatible with the constraints and the applied forces¹.
It is performed at a fixed time and therefore has nothing to do with the actual,
infinitesimal motion $\{d\mathbf{r}_i\}$ of the system during the time change $dt$ (i.e. the real
displacement).

¹ Here we make use of this somewhat archaic but very intuitive notion. Geometrically speaking,
virtual displacements are described by tangent vectors of the smooth hypersurface in $\mathbb{R}^{3n}$ that is
defined by (2.4). D’Alembert’s principle can and should be formulated in the geometric framework
of Chap. 5.

Loosely speaking, one may visualize the mechanical system as a half-timbered 
building that must fit in between the neighboring houses, on a given piece of land 
(these are the constraints), and that should be stable. In order to test its stability, 
one shakes the construction a little, without violating the constraints. One imagines 
the elements of the building to be shifted infinitesimally in all possible, allowed 
directions, and one observes how the construction responds as a whole. 

2.2.2 The Static Case 

To begin with, let us assume that the system is in equilibrium, i.e. $\mathbf{F}_i = 0$,
$i = 1, \dots, n$, where $\mathbf{F}_i$ is the total force applied to particle $i$. Imagine that the
constraint is taken care of by applying an additional force $\mathbf{Z}_i$ to every particle $i$
(such forces are called forces of constraint). Then

$$ \mathbf{F}_i = \mathbf{K}_i + \mathbf{Z}_i \, , \tag{2.5} $$

where $\mathbf{Z}_i$ is the force of constraint and $\mathbf{K}_i$ the real, dynamic force. Clearly, because
all the $\mathbf{F}_i$ vanish, the total virtual work vanishes:

$$ \sum_{i=1}^{n} \mathbf{F}_i \cdot \delta\mathbf{r}_i = 0 = \sum_{i=1}^{n} [\mathbf{K}_i + \mathbf{Z}_i] \cdot \delta\mathbf{r}_i \, . \tag{2.6} $$

However, since the virtual displacements must be compatible with the constraints,
the total work of the forces of constraint alone vanishes, too: $\sum_i \mathbf{Z}_i \cdot \delta\mathbf{r}_i = 0$.
Then, from (2.6) we obtain

$$ \sum_{i=1}^{n} \mathbf{K}_i \cdot \delta\mathbf{r}_i = 0 \, . \tag{2.7} $$

In contrast to (2.6), this equation does not imply, in general, that the individual
terms vanish. This is because the $\delta\mathbf{r}_i$ are generally not independent.

2.2.3 The Dynamical Case 

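If the system is moving, then we have $\mathbf{F}_i - \dot{\mathbf{p}}_i = 0$ and of course also
$\sum_{i=1}^{n} (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0$. As the total work of the forces of constraint vanishes,
$\sum_{i=1}^{n} (\mathbf{Z}_i \cdot \delta\mathbf{r}_i) = 0$, we obtain the basic equation expressing d’Alembert’s principle of virtual
displacements:

$$ \sum_{i=1}^{n} (\mathbf{K}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0 \, , \tag{2.8} $$

from which all constraints have disappeared. As in the case of (2.7), the individual
terms, in general, do not vanish, because the $\delta\mathbf{r}_i$ depend on each other.

Equation (2.8) is the starting point for obtaining the equations of motion for
the generalized coordinates. We proceed as follows.

As the conditions (2.4) are independent they can be solved locally for the coordinates
$\mathbf{r}_i$, i.e.

$$ \mathbf{r}_i = \mathbf{r}_i(q_1, \dots, q_f, t) \, , \qquad i = 1, \dots, n \, , \quad f = 3n - \Lambda \, . $$

From these we can deduce the auxiliary formulae

$$ \mathbf{v}_i \equiv \dot{\mathbf{r}}_i = \sum_{k=1}^{f} \frac{\partial\mathbf{r}_i}{\partial q_k} \dot{q}_k + \frac{\partial\mathbf{r}_i}{\partial t} \, , \tag{2.9} $$

$$ \frac{\partial\mathbf{v}_i}{\partial\dot{q}_k} = \frac{\partial\mathbf{r}_i}{\partial q_k} \, , \tag{2.10} $$

$$ \delta\mathbf{r}_i = \sum_{k=1}^{f} \frac{\partial\mathbf{r}_i}{\partial q_k} \delta q_k \, . \tag{2.11} $$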



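Note that there is no time derivative in (2.11), because the $\delta\mathbf{r}_i$ are virtual displacements,
i.e. are made at a fixed time. The first term on the left-hand side of (2.8)
can be written as

$$ \sum_{i=1}^{n} \mathbf{K}_i \cdot \delta\mathbf{r}_i = \sum_{k=1}^{f} Q_k \delta q_k \quad \text{with} \quad Q_k \stackrel{\text{def}}{=} \sum_{i=1}^{n} \mathbf{K}_i \cdot \frac{\partial\mathbf{r}_i}{\partial q_k} \, . \tag{2.12} $$

The quantities $Q_k$ are called generalized forces (again, they need not have the dimension
of force). The second term also takes the form $\sum_{k=1}^{f} \{\cdots\} \delta q_k$, as follows:

$$ \sum_{i=1}^{n} \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_{i=1}^{n} m_i \ddot{\mathbf{r}}_i \cdot \delta\mathbf{r}_i = \sum_{i=1}^{n} m_i \sum_{k=1}^{f} \ddot{\mathbf{r}}_i \cdot \frac{\partial\mathbf{r}_i}{\partial q_k} \, \delta q_k \, . $$

The scalar products $(\ddot{\mathbf{r}}_i \cdot \partial\mathbf{r}_i/\partial q_k)$ can be written as

$$ \ddot{\mathbf{r}}_i \cdot \frac{\partial\mathbf{r}_i}{\partial q_k} = \frac{d}{dt}\left( \dot{\mathbf{r}}_i \cdot \frac{\partial\mathbf{r}_i}{\partial q_k} \right) - \dot{\mathbf{r}}_i \cdot \frac{d}{dt} \frac{\partial\mathbf{r}_i}{\partial q_k} \, . $$

Note further that

$$ \frac{d}{dt} \frac{\partial\mathbf{r}_i}{\partial q_k} = \frac{\partial\dot{\mathbf{r}}_i}{\partial q_k} = \frac{\partial\mathbf{v}_i}{\partial q_k} $$

and, on taking the partial derivative of (2.9) with respect to $\dot{q}_k$, that
$\partial\mathbf{r}_i/\partial q_k = \partial\mathbf{v}_i/\partial\dot{q}_k$.

From these relations one obtains

$$ \sum_{i=1}^{n} m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial\mathbf{r}_i}{\partial q_k} = \sum_{i=1}^{n} \left\{ m_i \frac{d}{dt}\left( \mathbf{v}_i \cdot \frac{\partial\mathbf{v}_i}{\partial\dot{q}_k} \right) - m_i \mathbf{v}_i \cdot \frac{\partial\mathbf{v}_i}{\partial q_k} \right\} . $$

The two terms in the expression in curly brackets contain the form $\mathbf{v} \cdot \partial\mathbf{v}/\partial x = (\partial\mathbf{v}^2/\partial x)/2$,
with $x = q_k$ or $\dot{q}_k$, so that we finally obtain

$$ \sum_{i=1}^{n} \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_{k=1}^{f} \left\{ \frac{d}{dt}\left[ \frac{\partial}{\partial\dot{q}_k} \left( \sum_{i=1}^{n} \frac{m_i}{2} \mathbf{v}_i^2 \right) \right] - \frac{\partial}{\partial q_k} \left( \sum_{i=1}^{n} \frac{m_i}{2} \mathbf{v}_i^2 \right) \right\} \delta q_k \, . \tag{2.13} $$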



Inserting the results (2.12) and (2.13) into (2.8) yields an equation that contains
only the quantities $\delta q_k$, but not $\delta\mathbf{r}_i$. The displacements $\delta q_k$ are independent (in
contrast to the $\delta\mathbf{r}_i$, which are not). Therefore, in the equation that we obtain from
(2.8) by replacing all $\delta\mathbf{r}_i$ by the $\delta q_k$, as described above, every term must vanish
individually. Thus we obtain the set of equations

$$ \frac{d}{dt}\left( \frac{\partial T}{\partial\dot{q}_k} \right) - \frac{\partial T}{\partial q_k} = Q_k \, , \qquad k = 1, \dots, f \, , \tag{2.14} $$

where $T = \sum_{i=1}^{n} m_i \dot{\mathbf{r}}_i^2/2$ is the kinetic energy. Of course, very much like $Q_k$, $T$
must be expressed in terms of the variables $q_i$ and $\dot{q}_i$ so that (2.14) really does
become a system of differential equations for the $q_k(t)$.



2.3 Lagrange’s Equations 



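Suppose that, in addition, the real forces $\mathbf{K}_i$ are potential forces, i.e.

$$ \mathbf{K}_i = -\nabla_i U \, . \tag{2.15} $$

In this situation the generalized forces $Q_k$ are potential forces, also. Indeed, from
(2.12)

$$ Q_k = -\sum_{i=1}^{n} \nabla_i U(\mathbf{r}_1, \dots, \mathbf{r}_n) \cdot \frac{\partial\mathbf{r}_i}{\partial q_k} = -\frac{\partial}{\partial q_k} U(q_1, \dots, q_f, t) \, , \tag{2.16} $$

under the assumption that $U$ is transformed to the variables $q_k$. As $U$ does not
depend on the $\dot{q}_k$, $T$ and $U$ can be combined to

$$ L(q_k, \dot{q}_k, t) = T(q_k, \dot{q}_k) - U(q_k, t) \tag{2.17} $$

so that (2.14) takes the simple form

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial\dot{q}_k} \right) - \frac{\partial L}{\partial q_k} = 0 \, . \tag{2.18} $$

The function $L(q_k, \dot{q}_k, t)$ is called the Lagrangian function. Equations (2.18) are
called Lagrange’s equations. They contain the function $L$ (2.17) with

$$ U(q_1, \dots, q_f, t) = U(\mathbf{r}_1(q_1, \dots, q_f, t), \dots, \mathbf{r}_n(q_1, \dots, q_f, t)) \, , $$

$$ T(q_k, \dot{q}_k) = \frac{1}{2} \sum_{i=1}^{n} m_i \left( \sum_{k=1}^{f} \frac{\partial\mathbf{r}_i}{\partial q_k} \dot{q}_k + \frac{\partial\mathbf{r}_i}{\partial t} \right)^2 \equiv a + \sum_{k=1}^{f} b_k \dot{q}_k + \sum_{k=1}^{f} \sum_{l=1}^{f} c_{kl} \dot{q}_k \dot{q}_l \, , \tag{2.19} $$

where

$$ a = \frac{1}{2} \sum_{i=1}^{n} m_i \left( \frac{\partial\mathbf{r}_i}{\partial t} \right)^2 , \qquad b_k = \sum_{i=1}^{n} m_i \frac{\partial\mathbf{r}_i}{\partial q_k} \cdot \frac{\partial\mathbf{r}_i}{\partial t} \, , \qquad c_{kl} = \frac{1}{2} \sum_{i=1}^{n} m_i \frac{\partial\mathbf{r}_i}{\partial q_k} \cdot \frac{\partial\mathbf{r}_i}{\partial q_l} \, . \tag{2.20} $$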



The special form $L = T - U$ of the Lagrangian function is called its natural
form. (For the reasons explained in Sect. 2.11 below, $L$ is not unique.) For scleronomic
constraints both $a$ and all $b_k$ vanish. In this case $T$ is a homogeneous
function of degree 2 in the variables $\dot{q}_k$.

We note that d’Alembert’s equations (2.14) are somewhat more general than
Lagrange’s equations (2.18): the latter follow only if the forces are potential forces.
In contrast, the former also hold if the constraints are formulated in a differential
form (2.11) that cannot be integrated to holonomic equations.



2.4 Examples of the Use of Lagrange’s Equations 

We study three elementary examples. 

Example (i) A particle of mass $m$ moves on a segment of a sphere in the earth’s
gravitational field. The dynamical force is $\mathbf{K} = (0, 0, -mg)$, the constraint is $|\mathbf{r}| = R$,
and the generalized coordinates may be chosen to be $q_1 = \theta$ and $q_2 = \varphi$, as
shown in Fig. 2.3. The generalized forces are

$$ Q_1 = \mathbf{K} \cdot \frac{\partial\mathbf{r}}{\partial q_1} = -R K_z \sin\theta = R m g \sin\theta \, , \qquad Q_2 = 0 \, . $$

These are potential forces, $Q_1 = -\partial U/\partial q_1$, $Q_2 = -\partial U/\partial q_2$, with
$U(q_1, q_2) = mgR[1 + \cos q_1]$. Furthermore, $T = mR^2[\dot{q}_1^2 + \dot{q}_2^2 \sin^2 q_1]/2$, and therefore

$$ L = \tfrac{1}{2} m R^2 [\dot{q}_1^2 + \dot{q}_2^2 \sin^2 q_1] - mgR[1 + \cos q_1] \, . $$

Fig. 2.3. A small ball on a segment of a sphere
in the earth’s gravitational field. The force of
constraint Z is such that it keeps the particle
on the given surface. As such it is equivalent
to the constraint

Fig. 2.4. Two pointlike weights $m_1$ and $m_2$ are
connected by a massless thread, which rests on
a wheel. The motion is assumed to be frictionless

We now calculate the derivatives $\partial L/\partial q_i$ and $d(\partial L/\partial\dot{q}_i)/dt$:

$$ \frac{\partial L}{\partial q_1} = m R^2 \dot{q}_2^2 \sin q_1 \cos q_1 + m g R \sin q_1 \, , \qquad \frac{\partial L}{\partial q_2} = 0 \, , $$

$$ \frac{\partial L}{\partial\dot{q}_1} = m R^2 \dot{q}_1 \, , \qquad \frac{\partial L}{\partial\dot{q}_2} = m R^2 \dot{q}_2 \sin^2 q_1 $$

to obtain the equations of motion

$$ \ddot{q}_1 - \left[ \dot{q}_2^2 \cos q_1 + \frac{g}{R} \right] \sin q_1 = 0 \, , \qquad m R^2 \frac{d}{dt}\left( \dot{q}_2 \sin^2 q_1 \right) = 0 \, . $$


Example (ii) Atwood’s machine is sketched in Fig. 2.4. The wheel and the thread
are assumed to be massless; the wheel rotates without friction. We then have

$$ T = \tfrac{1}{2}(m_1 + m_2)\dot{x}^2 \, , $$

$$ U = -m_1 g x - m_2 g (l - x) \, , $$

$$ L = T - U \, . $$

The derivatives of $L$ are $\partial L/\partial x = (m_1 - m_2)g$, $\partial L/\partial\dot{x} = (m_1 + m_2)\dot{x}$, so that
the equation of motion $d(\partial L/\partial\dot{x})/dt = \partial L/\partial x$ becomes

$$ \ddot{x} = \frac{m_1 - m_2}{m_1 + m_2} \, g \, . $$

It can be integrated at once. If the mass of the wheel cannot be neglected, its
rotation will contribute to the kinetic energy $T$ by the amount $\bar{T} = I (d\theta/dt)^2/2$,
where $I$ is the relevant moment of inertia and $d\theta/dt$ its angular velocity. Let $R$ be
the radius of the wheel. The angular velocity is proportional to $\dot{x}$, viz. $R(d\theta/dt) = \dot{x}$.
Therefore, the kinetic energy is changed to $T = (m_1 + m_2 + I/R^2)\dot{x}^2/2$. (The
rotary motion of a rigid body such as this wheel is dealt with in Chap. 3.)

Example (iii) Consider a particle of mass $m$ held by a massless thread and rotating
about the point $S$, as shown in Fig. 2.5. The thread is shortened at a constant rate
$c$ per unit time. Let $x$ and $y$ be Cartesian coordinates in the plane of the circle,
$\varphi$ the polar angle in that plane. The generalized coordinate is $q = \varphi$, and we
have $x = (R_0 - ct)\cos q$, $y = (R_0 - ct)\sin q$; thus $T = m(\dot{x}^2 + \dot{y}^2)/2 =
m[\dot{q}^2 (R_0 - ct)^2 + c^2]/2$. In this example $T$ is not a homogeneous function of
degree 2 in $\dot{q}$ (the constraint is rheonomic!). The equation of motion now reads
$m\dot{q}(R_0 - ct)^2 = \text{const}$.







Fig. 2.5. The mass point m rotates about the point S. At the 
same time, the thread holding the mass point is shortened con- 
tinuously 
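
The equations of motion of Example (i), the particle on a sphere, lend themselves to a quick numerical check. The sketch below is an illustration added here, with assumed parameter values and a hypothetical function name: it uses the second equation of motion to eliminate $\dot{q}_2 = c/\sin^2 q_1$ with the constant $c = \dot{q}_2 \sin^2 q_1$, integrates $\ddot{q}_1 = [\dot{q}_2^2 \cos q_1 + g/R]\sin q_1$ with a classical fourth-order Runge-Kutta step (a choice of convenience), and monitors the energy per unit $mR^2$, $\tfrac{1}{2}(\dot{q}_1^2 + \dot{q}_2^2\sin^2 q_1) + (g/R)(1 + \cos q_1)$, which should stay constant.

```python
import math

def simulate(q1=2.0, q1_dot=0.0, q2_dot=1.0, g_over_R=9.81, h=1e-3, n_steps=5000):
    """Integrate the particle on a sphere; return (q1, q2, energy drift)."""
    c = q2_dot * math.sin(q1) ** 2      # conserved: q2_dot * sin^2(q1)

    def energy(theta, w):
        # energy divided by m*R^2, with q2_dot eliminated via c
        return (0.5 * (w * w + c * c / math.sin(theta) ** 2)
                + g_over_R * (1.0 + math.cos(theta)))

    def rhs(theta, w):
        s, co = math.sin(theta), math.cos(theta)
        q2d = c / (s * s)
        # returns (d q1/dt, d q1_dot/dt, d q2/dt)
        return w, (q2d * q2d * co + g_over_R) * s, q2d

    w, q2 = q1_dot, 0.0
    e0 = energy(q1, w)
    for _ in range(n_steps):
        k1 = rhs(q1, w)
        k2 = rhs(q1 + 0.5 * h * k1[0], w + 0.5 * h * k1[1])
        k3 = rhs(q1 + 0.5 * h * k2[0], w + 0.5 * h * k2[1])
        k4 = rhs(q1 + h * k3[0], w + h * k3[1])
        q1 += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        w += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        q2 += h * (k1[2] + 2.0 * k2[2] + 2.0 * k3[2] + k4[2]) / 6.0
    return q1, q2, abs(energy(q1, w) - e0)

q1, q2, drift = simulate()
print(q1, q2, drift)
```

Because $q_2$ is cyclic, only $q_1$ needs a genuine second-order integration; the conserved quantity $\dot{q}_2 \sin^2 q_1$ also keeps the particle away from the poles $q_1 = 0, \pi$ whenever it is nonzero.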



2.5 A Digression on Variational Principles 

Both conditions (2.7) and (2.8) of d’Alembert’s principle, for the static case and 
dynamic case respectively, are expressions for an equilibrium: if one “shakes” the 
mechanical system one is considering, in a way that is compatible with the con- 
straints, the total (virtual) work is equal to zero. In this sense the state of the system 
is an extremum; the physical state, i.e. the one that is actually realized, has the 
distinct property, in comparison with all other possible states one might imagine, 
that it is stable against small changes of the position (in the static case) or against 
small changes of the orbits (in the dynamic case). Such an observation is familiar 
from geometric optics. Indeed, Fermat’s principle states that in an arbitrary system 
of mirrors and refracting glasses a light ray always chooses a path that assumes 
an extreme value. The light’s path is either the shortest or the longest between its 
source and the point where it is detected. 

D’Alembert’s principle and the experience with Fermat’s principle in optics
raise the question whether it is possible to define a functional, for a given mechanical
system, that bears some analogy to the length of path of a light ray. The
actual physical orbit that the system chooses (for given initial condition) would
make this functional an extremum. Physical orbits would be some kind of geodesic
on a manifold determined by the forces; that is, they would usually be the shortest
(or the longest) curves connecting initial and final configurations.

There is indeed such a functional for a large class of mechanical systems: the 
time integral over a Lagrangian function such as (2.17). This is what we wish to 
develop, step by step, in the following sections. 

In fact, in doing so one discovers a gold mine: this extremum, or variational, 
principle can be generalized to field theories, i.e. systems with an infinite number 
of degrees of freedom, as well as to quantized and relativistic systems. Today it 
looks as though any theory of fundamental interactions can be derived from a vari- 
ational principle. Consequently it is rewarding to study this new, initially somewhat 
abstract, principle and to develop some feeling for its use. This effort pays in that 
it allows for a deeper understanding of the rich structure of classical mechanics, 
which in turn serves as a model for many theoretical developments. 

One should keep in mind that philosophical and cosmological ideas and con- 
cepts were essential to the development of mechanics during the seventeenth and 
eighteenth centuries. It is not surprising, therefore, that the extremum principles 
reflect philosophical ideas in a way that can still be felt in their modern, somewhat 
axiomatic, formulation. 

The mathematical basis for the discussion of extremum principles is provided
by variational calculus. In order to prepare the ground for the following sections,
but without going into too much detail, we discuss here a typical, fundamental
problem of variational calculus. It consists in finding a real function $y(x)$ of a real
variable $x$ such that a given functional $I[y]$ of this function assumes an extreme
value. Let

$$ I[y] \stackrel{\text{def}}{=} \int_{x_1}^{x_2} dx \, f(y(x), y'(x), x) \, , \qquad y'(x) \equiv \frac{d}{dx} y(x) \tag{2.21} $$

be a functional of $y$, with $f$ a given function of $y$, $y'$ (the derivative of $y$ with
respect to $x$), and the variable $x$. $x_1$ and $x_2$ are arbitrary, but fixed, endpoints. The
problem is to determine those functions $y(x)$ which take given values $y_1 = y(x_1)$
and $y_2 = y(x_2)$ at the endpoints and which make the functional $I[y]$ an extremum.
In other words, one supposes that all possible functions $y(x)$ that assume the given
boundary values are inserted into the integral (2.21) and that its numerical value
is calculated. What we are looking for are those functions for which this value
assumes an extremal value, i.e. is a maximum or a minimum, or, possibly, a saddle
point.

As a first step we investigate the quantity 

. f r*2 

1(a) = I f(y(x, a), y'(x, a), x) dx , (2.22) 

Jxi 

where y(x, a) = y(x)+ar](x) with ri(x 1 ) = 0 = rj(x 2 ). This means that we embed 
y (x ) in a set of comparative curves that fulfill the same boundary conditions as 
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v (x ) . Figure 2.6 shows an example. The next step is to calculate the so-called 
variation of I, that is, the quantity 



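$$ \delta I \overset{\text{def}}{=} \frac{dI}{d\alpha}\, d\alpha = \int_{x_1}^{x_2} dx \left\{ \frac{\partial f}{\partial y}\frac{dy}{d\alpha} + \frac{\partial f}{\partial y'}\frac{dy'}{d\alpha} \right\} d\alpha\,. $$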



Clearly, $dy'/d\alpha = (d/dx)(dy/d\alpha)$. If the second term is integrated by parts,



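$$ \int_{x_1}^{x_2} dx\, \frac{\partial f}{\partial y'}\, \frac{d}{dx}\left(\frac{dy}{d\alpha}\right) = -\int_{x_1}^{x_2} dx\, \frac{dy}{d\alpha}\, \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) + \left[\frac{\partial f}{\partial y'}\frac{dy}{d\alpha}\right]_{x_1}^{x_2}\,, $$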



the boundary terms do not contribute, because $dy/d\alpha = \eta(x)$ vanishes at $x_1$ and
at $x_2$. Thus



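$$ \delta I = \int_{x_1}^{x_2} dx \left[ \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right] \frac{dy}{d\alpha}\, d\alpha\,. \tag{2.23} $$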



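The expression

$$ \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \overset{\text{def}}{=} \frac{\delta f}{\delta y} \tag{2.24} $$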

is called the variational derivative of $f$ by $y$. It is useful to introduce the notation
$(dy/d\alpha)\,d\alpha \overset{\text{def}}{=} \delta y$ and to interpret $\delta y$ as an infinitesimal variation of the curve $y(x)$.
We require that $I(\alpha)$ assume an extreme value, i.e. $\delta I = 0$. As this must hold true for arbitrary
variations $\delta y$, the integrand in (2.23) must vanish:



$$ \frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) = 0\,. \tag{2.25} $$



This is Euler’s differential equation of variational calculus. With a substitution
of $L(q,\dot q,t)$ for $f(y,y',x)$, a comparison with (2.18) shows that it is identical
with Lagrange’s equation $d(\partial L/\partial\dot q)/dt - \partial L/\partial q = 0$ (here in one dimension).
This surprising result is the starting point for the variational principle proposed by
Hamilton.
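
The vanishing of the variation on a solution of Euler’s equation can be checked numerically. The sketch below is only an illustration under assumptions that are not taken from the text: it picks $f = (y'^2 - y^2)/2$ on the interval $[0, 1]$ with $y(0) = 0$, $y(1) = 1$, for which (2.25) reads $y'' + y = 0$ with solution $y = \sin x/\sin 1$, and it uses the comparison function $\eta(x) = \sin(\pi x)$. The function names are hypothetical.

```python
import math

def action(alpha, n=2000):
    """Midpoint-rule value of I(alpha) for f = (y'^2 - y^2)/2 on [0, 1].

    The curve is y(x) + alpha*eta(x), with y = sin(x)/sin(1) solving
    y'' + y = 0 and eta = sin(pi*x) vanishing at both endpoints.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        y = math.sin(x) / math.sin(1.0) + alpha * math.sin(math.pi * x)
        dy = (math.cos(x) / math.sin(1.0)
              + alpha * math.pi * math.cos(math.pi * x))
        total += 0.5 * (dy * dy - y * y) * h
    return total

def variation_slope(eps=1e-3):
    """Central-difference estimate of dI/dalpha at alpha = 0."""
    return (action(eps) - action(-eps)) / (2.0 * eps)

if __name__ == "__main__":
    print("dI/dalpha at alpha = 0:", variation_slope())
    print("I(0), I(0.1):", action(0.0), action(0.1))
```

With these choices the slope vanishes up to quadrature error, and $I(\pm 0.1)$ exceeds $I(0)$, so for this particular comparison function the extremum is a minimum.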
2.6 Hamilton’s Variational Principle (1834)



Postulate. To a mechanical system with $f$ degrees of freedom $q =
\{q_1, q_2, \dots, q_f\}$ we associate a $C^2$ function of the variables $q$ and $\dot q$ and
of the time $t$,

$$ L(q, \dot q, t)\,, \tag{2.26} $$

called the Lagrangian function. Let

$$ \varphi(t) = (\varphi_1(t), \dots, \varphi_f(t)) $$

in the interval $t_1 \le t \le t_2$ be a physical orbit (i.e. a solution of the equations
of motion) that assumes the boundary values $\varphi(t_1) = a$ and $\varphi(t_2) = b$. This
orbit is such that the action integral

$$ I[q] \overset{\text{def}}{=} \int_{t_1}^{t_2} dt\, L(q(t), \dot q(t), t) \tag{2.27} $$

assumes an extreme value.²



The physical orbit, i.e. the solution of the equations of motion for the specified 
boundary conditions, is singled out from all other possible orbits that the system 
might choose and that have the same boundary values by the requirement that the 
action integral be an extremum. For suitable choices of the boundary values $(t_1, a)$
and $(t_2, b)$ this will be a minimum. However, the example worked out in Exercise
2.18 shows that it can also be a maximum. Saddle point values are possible, too. 
We return to this question in Sect. 2.36 (ii) below. 



2.7 The Euler-Lagrange Equations 

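A necessary condition for the action integral $I[q]$ to assume an extreme value, for
$q = \varphi(t)$, is that $\varphi(t)$ be the integral curve of the Euler-Lagrange equations

$$ \frac{\delta L}{\delta q_k} = \frac{\partial L}{\partial q_k} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_k}\right) = 0\,, \qquad k = 1, \dots, f\,. \tag{2.28} $$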

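The proof of this statement proceeds in analogy to that in Sect. 2.5. Indeed, set
$q(t,\alpha) = \varphi(t) + \alpha\,\psi(t)$ with $-1 \le \alpha \le +1$ and $\psi(t_1) = 0 = \psi(t_2)$. If $I$ is an
extremum for $q = \varphi(t)$, then

$$ \left.\frac{d}{d\alpha} I(\alpha)\right|_{\alpha=0} = 0 \quad\text{with}\quad I(\alpha) \overset{\text{def}}{=} \int_{t_1}^{t_2} dt\, L(q(t,\alpha), \dot q(t,\alpha), t) $$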

2 The name action arises because L has the dimension of energy: the product (energy × time) is
called action, and this is indeed the dimension of the action integral.
and with

$$ \frac{d}{d\alpha} I(\alpha) = \int_{t_1}^{t_2} dt \sum_{k=1}^{f} \left\{ \frac{\partial L}{\partial q_k}\frac{dq_k}{d\alpha} + \frac{\partial L}{\partial \dot q_k}\frac{d\dot q_k}{d\alpha} \right\}\,. $$

With regard to the second term of this expression we verify that the function

$$ K(t,\alpha) \overset{\text{def}}{=} \sum_{k=1}^{f} \frac{\partial L}{\partial \dot q_k}\, \frac{d}{d\alpha}\, q_k(t,\alpha) $$

vanishes at the end points, $K(t_1,\alpha) = 0 = K(t_2,\alpha)$. Integrating by parts we obtain



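$$ \int_{t_1}^{t_2} dt \sum_{k=1}^{f} \frac{\partial L}{\partial \dot q_k}\frac{d\dot q_k}{d\alpha} = -\int_{t_1}^{t_2} dt \sum_{k=1}^{f} \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_k}\right) \frac{dq_k}{d\alpha}\,, $$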



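and therefore

$$ \left.\frac{dI(\alpha)}{d\alpha}\right|_{\alpha=0} = \int_{t_1}^{t_2} dt \sum_{k=1}^{f} \left[ \frac{\partial L}{\partial q_k} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_k}\right) \right] \psi_k(t) = 0\,. $$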



The functions $\psi_k(t)$ are arbitrary and independent. Therefore, the integrand must
vanish termwise. Thus, (2.28) is proved. □

Lagrange’s equations follow from the variational principle of Hamilton. They 
are the same as the equations we obtained from d’Alembert’s principle in the case 
where the forces were potential forces. As a result, we obtain $f$ ordinary differential
equations of second order in the time variable, $f$ being the number of degrees
of freedom of the system under consideration.



2.8 Further Examples of the Use of Lagrange’s Equations 

The equations of motion (2.28) generalize Newton’s second law. We confirm this 
statement, in a first step, by verifying that in the case of the n -particle system with- 
out any constraints these equations take the Newtonian form. The second example 
goes beyond the framework of “natural” Lagrangian functions and, in fact, puts 
us on a new and interesting track that we follow up in the subsequent sections. 

Example (i) An n-particle system with potential forces. As there are no forces of
constraint we take as coordinates the position vectors of the particles. For L we
choose the natural form

$$ L = T - U = \frac{1}{2} \sum_{i=1}^{n} m_i \dot{\mathbf r}_i^{\,2} - U(\mathbf r_1, \dots, \mathbf r_n, t)\,, $$

$$ q \equiv \{q_1, \dots, q_{f=3n}\} = \{\mathbf r_1, \dots, \mathbf r_n\}\,, $$

$$ \frac{\partial L}{\partial q_k} = -\frac{\partial U}{\partial q_k}\,, \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot q_k} = m_{i(k)}\, \ddot q_k\,. $$

The notation $m_{i(k)}$ is meant to indicate that in counting the $q_k$ one must insert
the correct mass of the corresponding particle, i.e. $m_1$ for $q_1, q_2, q_3$, then $m_2$ for
$q_4, q_5, q_6$, and so on. Written differently, we obtain $m_i \ddot{\mathbf r}_i = -\nabla_i U$. Thus, in this
case the Euler-Lagrange equations are nothing but the well-known equations of 
Newton. Therefore, the mechanics that we studied in Chap. 1 can be derived as a 
special case from the variational principle of Hamilton. 



Example (ii) A charged particle in electric and magnetic fields. Here we set
$q \equiv \{q_1, q_2, q_3\} = \{x, y, z\}$. The motion of a charged, pointlike particle under the
action of time- and space-dependent electric and magnetic fields is described by
the equation

$$ m\ddot q = e\,E(q,t) + \frac{e}{c}\, \dot q(t) \times B(q,t)\,, \tag{2.29} $$

where e is its charge. The expression on the right-hand side is the Lorentz force. 
The electric and magnetic fields may be expressed in terms of scalar and vector 
potentials as follows: 

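$$ E(q,t) = -\nabla_q \Phi(q,t) - \frac{1}{c}\frac{\partial}{\partial t} A(q,t)\,, $$

$$ B(q,t) = \nabla_q \times A(q,t)\,, \tag{2.30} $$

where $\Phi$ denotes the scalar potential and $A$ denotes the vector potential. The equation
of motion (2.29) is obtained, for example, from the following Lagrangian
function (whose form we postulate at this point):

$$ L(q,\dot q,t) = \frac{1}{2} m \dot q^{\,2} - e\,\Phi(q,t) + \frac{e}{c}\, \dot q \cdot A(q,t)\,. \tag{2.31} $$

Indeed, using the chain rule, one verifies that

$$ \frac{\partial L}{\partial q_i} = -e\frac{\partial \Phi}{\partial q_i} + \frac{e}{c} \sum_{k=1}^{3} \dot q_k \frac{\partial A_k}{\partial q_i}\,, $$

$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} = m\ddot q_i + \frac{e}{c}\frac{dA_i}{dt} = m\ddot q_i + \frac{e}{c}\left[ \sum_{k=1}^{3} \dot q_k \frac{\partial A_i}{\partial q_k} + \frac{\partial A_i}{\partial t} \right]\,, $$

so that from (2.28) there follows the correct equation of motion,

$$ m\ddot q_i = e\left[ -\frac{\partial \Phi}{\partial q_i} - \frac{1}{c}\frac{\partial A_i}{\partial t} \right] + \frac{e}{c} \sum_{k=1}^{3} \dot q_k \left[ \frac{\partial A_k}{\partial q_i} - \frac{\partial A_i}{\partial q_k} \right] = e E_i + \frac{e}{c}\, (\dot q \times B)_i\,. $$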




Note that with respect to rotations of the frame of reference the Lagrangian
function (2.31) stays invariant, hence is a scalar while the equation of motion 
(2.29) is a vectorial equation. Both sides transform like vector fields under rota- 
tions. Obviously, scalar, invariant quantities are simpler than quantities that have a 
specific, but nontrivial transformation behaviour. We will come back to this remark 
repeatedly in subsequent sections. 



2.9 A Remark About Nonuniqueness 
of the Lagrangian Function 



In Example (ii) of Sect. 2.8 the potentials can be chosen differently without changing
either the observable field strengths (2.30) or the equations of motion (2.29). Let $\chi$
be a scalar, differentiable function of position and time. Replace then the potentials
as follows:

$$ A(q,t) \to A'(q,t) = A(q,t) + \nabla \chi(q,t)\,, $$

$$ \Phi(q,t) \to \Phi'(q,t) = \Phi(q,t) - \frac{1}{c}\frac{\partial}{\partial t} \chi(q,t)\,. \tag{2.32} $$



The effect of this transformation on the Lagrangian function is the following: 

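$$ L'(q,\dot q,t) \overset{\text{def}}{=} \frac{1}{2} m \dot q^{\,2} - e\,\Phi' + \frac{e}{c}\, \dot q \cdot A' = L(q,\dot q,t) + \frac{e}{c}\left[ \frac{\partial \chi}{\partial t} + \dot q \cdot \nabla \chi \right] = L(q,\dot q,t) + \frac{d}{dt}\left( \frac{e}{c}\, \chi(q,t) \right)\,. $$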



We see that L is modified by the total time derivative of a function of q and t. 
The potentials are not observable and are therefore not unique. What the example 
tells us is that the Lagrangian function is not unique either and therefore that it 
certainly cannot be an observable. L' leads to the same equations of motion as L. 
The two differ by the total time derivative of a function M(q,t), 



$$ L'(q,\dot q,t) = L(q,\dot q,t) + \frac{d}{dt} M(q,t) \tag{2.33} $$



(here with $M = e\chi/c$). The statement that $L'$ describes the same physics as $L$ is
quite general. As the transformation from $L$ to $L'$ is induced by the gauge
transformation (2.32) of the potentials, we shall call transformations of the kind (2.33)
gauge transformations of the Lagrangian function. The general case is the subject 
of the next section. 
2.10 Gauge Transformations of the Lagrangian Function



Proposition. Let $M(q,t)$ be a $C^3$ function and let

$$ L'(q,\dot q,t) = L(q,\dot q,t) + \sum_{k=1}^{f} \frac{\partial M}{\partial q_k}\, \dot q_k + \frac{\partial M}{\partial t}\,. $$

Then $q(t)$ is the integral curve of $\delta L'/\delta q_k = 0$, $k = 1, \dots, f$, if and only
if it is a solution of $\delta L/\delta q_k = 0$, $k = 1, \dots, f$.



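Proof. For $k = 1, \dots, f$ calculate

$$ \frac{\delta L'}{\delta q_k} = \frac{\delta L}{\delta q_k} + \left[ \frac{\partial}{\partial q_k} - \frac{d}{dt}\frac{\partial}{\partial \dot q_k} \right] \frac{dM}{dt} $$

$$ = \frac{\delta L}{\delta q_k} + \frac{d}{dt}\frac{\partial M}{\partial q_k} - \frac{d}{dt}\frac{\partial}{\partial \dot q_k} \left( \sum_{i=1}^{f} \frac{\partial M}{\partial q_i}\, \dot q_i + \frac{\partial M}{\partial t} \right) = \frac{\delta L}{\delta q_k}\,. $$

The additional terms that depend on $M(q,t)$ cancel. So, if $\delta L/\delta q_k = 0$, then
$\delta L'/\delta q_k = 0$, and vice versa. □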



Note that $M$ should not depend on the $\dot q_j$. The reason for this becomes clear
from the following observation. We could have proved the proposition by means
of Hamilton’s principle. If we add the term $dM(q,t)/dt$ to the integrand of (2.27),
we obtain simply the difference $M(q(t_2), t_2) - M(q(t_1), t_1)$. As the variation leaves the
end points and the initial and final times fixed, this difference gives no contribution
to the equations of motion. These equations are therefore the same for $L$ and for
$L'$. It is then clear why $M$ should not depend on $\dot q$: if one fixes $t_1$, $t_2$ as well as
$q(t_1)$, $q(t_2)$, one cannot require the derivatives $\dot q$ to be fixed at the end points as well.
This may also be read off Fig. 2.6. 

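The harmonic oscillator of Sect. 1.17.1 may serve as an example. The natural
form of the Lagrangian is $L = T - U$, i.e.

$$ L = \frac{1}{2} \left[ \left( \frac{dz_1}{d\tau} \right)^2 - z_1^2 \right] $$

and leads to the correct equations of motion (1.39). The function

$$ L' = \frac{1}{2} \left[ \left( \frac{dz_1}{d\tau} \right)^2 - z_1^2 \right] + z_1 \frac{dz_1}{d\tau} $$

leads to the same equations because we have added $dM/d\tau = (dz_1^2/d\tau)/2$, i.e. $M = z_1^2/2$.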
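
The statement that $L$ and $L'$ have the same variational derivative can be tested numerically along an arbitrary curve, which need not be a solution. The sketch below assumes the reduced oscillator Lagrangian $L = (v^2 - z^2)/2$ with $v = dz/d\tau$ and the added term $z\,v = dM/d\tau$ for $M = z^2/2$; the trial path $z(\tau) = \sin(1.3\,\tau)$, the step sizes, and all function names are hypothetical choices made for illustration.

```python
import math

def lagrangian(z, v):
    """Natural form L = (v^2 - z^2)/2 in the reduced variables."""
    return 0.5 * (v * v - z * z)

def lagrangian_gauged(z, v):
    """L' = L + z*v, i.e. L plus dM/dtau with M = z^2/2."""
    return lagrangian(z, v) + z * v

def path(t):
    """Arbitrary smooth trial path and its velocity (not a solution)."""
    return math.sin(1.3 * t), 1.3 * math.cos(1.3 * t)

def variational_derivative(L, t, h=1e-5, dt=1e-3):
    """Finite-difference estimate of dL/dz - d/dt(dL/dv) along path()."""
    def dL_dz(s):
        z, v = path(s)
        return (L(z + h, v) - L(z - h, v)) / (2.0 * h)

    def dL_dv(s):
        z, v = path(s)
        return (L(z, v + h) - L(z, v - h)) / (2.0 * h)

    return dL_dz(t) - (dL_dv(t + dt) - dL_dv(t - dt)) / (2.0 * dt)

r_plain = variational_derivative(lagrangian, 0.7)
r_gauged = variational_derivative(lagrangian_gauged, 0.7)

if __name__ == "__main__":
    print("delta L / delta z :", r_plain)
    print("delta L'/ delta z :", r_gauged)
```

Both values agree within the finite-difference error, although neither vanishes, because the trial path does not solve the equations of motion. The gauge term drops out of the variational derivative identically, not only on physical orbits.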

Lagrange’s equations are even invariant under arbitrary, one-to-one, differentiable
transformations of the generalized coordinates. Such transformations are
called diffeomorphisms: they are defined to be one-to-one maps $f: U \to V$ for
which both $f$ and its inverse $f^{-1}$ are differentiable. The following proposition
deals with transformations of this class.





2.11 Admissible Transformations
of the Generalized Coordinates 



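Proposition. Let $G: q \mapsto Q$ be a diffeomorphism (which should be at
least $C^2$), $g = G^{-1}$ its inverse,

$$ Q_i = G_i(q,t) \quad\text{or}\quad q_k = g_k(Q,t)\,, \qquad i, k = 1, \dots, f\,. $$

In particular, one then knows that

$$ \det(\partial g_j/\partial Q_k) \neq 0\,. \tag{2.34} $$

Then the equations $\delta L/\delta q_k = 0$ are equivalent to $\delta \bar L/\delta Q_k = 0$, $k = 1, \dots, f$,
i.e. $Q(t)$ is a solution of the Lagrange equations of the transformed
Lagrangian function

$$ \bar L = L \circ g = L\Big( g_1(Q,t), \dots, g_f(Q,t),\ \sum_{k=1}^{f} \frac{\partial g_1}{\partial Q_k} \dot Q_k + \frac{\partial g_1}{\partial t},\ \dots,\ \sum_{k=1}^{f} \frac{\partial g_f}{\partial Q_k} \dot Q_k + \frac{\partial g_f}{\partial t},\ t \Big) \tag{2.35} $$

if and only if $q(t)$ is a solution of the Lagrange equations for $L(q,\dot q,t)$.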



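Proof. Take the variational derivatives of $\bar L$ by the $Q_k$, $\delta \bar L/\delta Q_k$, i.e. calculate

$$ \frac{d}{dt}\left( \frac{\partial \bar L}{\partial \dot Q_k} \right) = \sum_{i=1}^{f} \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q_i} \frac{\partial \dot q_i}{\partial \dot Q_k} \right) = \sum_{i=1}^{f} \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q_i} \frac{\partial q_i}{\partial Q_k} \right) = \sum_{i=1}^{f} \left[ \frac{\partial L}{\partial \dot q_i} \frac{\partial \dot q_i}{\partial Q_k} + \frac{\partial q_i}{\partial Q_k} \frac{d}{dt} \frac{\partial L}{\partial \dot q_i} \right]\,. \tag{2.36} $$

In the second step we have made use of $\dot q_i = \sum_k (\partial g_i/\partial Q_k)\, \dot Q_k + \partial g_i/\partial t$, from
which follows $\partial \dot q_i/\partial \dot Q_k = \partial g_i/\partial Q_k$.
Calculating

$$ \frac{\partial \bar L}{\partial Q_k} = \sum_{i=1}^{f} \left[ \frac{\partial L}{\partial q_i} \frac{\partial q_i}{\partial Q_k} + \frac{\partial L}{\partial \dot q_i} \frac{\partial \dot q_i}{\partial Q_k} \right] $$

and subtracting (2.36) yields

$$ \frac{\delta \bar L}{\delta Q_k} = \sum_{i=1}^{f} \frac{\partial g_i}{\partial Q_k} \frac{\delta L}{\delta q_i}\,. \tag{2.37} $$

By assumption the transformation matrix $\{\partial g_i/\partial Q_k\}$ is not singular; cf. (2.34).
This proves the proposition. □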

Another way of stating this result is this: the variational derivatives $\delta L/\delta q_k$ are
covariant under diffeomorphic transformations of the generalized coordinates.

It is not correct, therefore, to state that the Lagrangian function is “$T - U$”.
Although this is a natural form, in those cases where kinetic and potential energies
are defined it is certainly not the only one that describes a given problem. In general,
$L$ is a function of $q$ and $\dot q$, as well as of time $t$, and no more. How to construct
a Lagrangian function is more a question of the symmetries and invariances of the 
physical system one wishes to describe. There may well be cases where there is no 
kinetic energy or no potential energy, in the usual sense, but where a Lagrangian 
can be found, up to gauge transformations (2.33), which gives the correct equa- 
tions of motion. This is true, in particular, in applying the variational principle of 
Hamilton to theories in which fields take over the role of dynamical variables. For 
such theories, the notion of kinetic and potential parts in the Lagrangian must be 
generalized anyway, if they are defined at all. 

The proposition proved above tells us that with any set of generalized coordi- 
nates there is an infinity of other, equivalent sets of variables. Which set is chosen 
in practice depends on the peculiarities of the system under consideration. For ex- 
ample, a clever choice will be one where as many integrals of the motion as possi- 
ble will be manifest. We shall say more about this as well as about the geometric 
meaning of this multiplicity later. For the moment we note that the transforma- 
tions must be diffeomorphisms. In transforming to new coordinates we wish to 
conserve the number of degrees of freedom as well as the differential structure 
of the system. Only then can the physics be independent of the special choice of 
variables. 



2.12 The Hamiltonian Function and Its Relation 
to the Lagrangian Function L 



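It is easy to convince oneself of the following fact. If the Lagrangian function $L$
has no explicit time dependence then the function

$$ \tilde H(q,\dot q) = \sum_{k=1}^{f} \dot q_k \frac{\partial L}{\partial \dot q_k} - L(q,\dot q) \tag{2.38} $$

is a constant of the motion. Indeed, differentiating with respect to time and making
use of the equations $\delta L/\delta q_k = 0$, one has

$$ \frac{d\tilde H}{dt} = \sum_{i=1}^{f} \left[ \ddot q_i \frac{\partial L}{\partial \dot q_i} + \dot q_i \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i}\, \dot q_i - \frac{\partial L}{\partial \dot q_i}\, \ddot q_i \right] = 0\,. $$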



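Take as an example $L = m\dot{\mathbf r}^2/2 - U(\mathbf r) = T - U$. Equation (2.38) gives $\tilde H(\mathbf r, \dot{\mathbf r}) =
2T - (T - U) = T + U = (m\dot{\mathbf r}^2/2) + U(\mathbf r)$. If we set $m\dot{\mathbf r} = \mathbf p$, $\tilde H$ goes over into
$H(\mathbf r, \mathbf p) = \mathbf p^2/2m + U(\mathbf r)$. In doing so, we note that the momentum $\mathbf p$ is given
by the partial derivative of $L$ by $\dot{\mathbf r}$, $\mathbf p = (\partial L/\partial \dot x, \partial L/\partial \dot y, \partial L/\partial \dot z)$. This leads us
to the definition in the general case³

$$ p_k \overset{\text{def}}{=} \frac{\partial L}{\partial \dot q_k}\,, \tag{2.39} $$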



where $p_k$ is called the momentum canonically conjugate to the coordinate $q_k$. One
reason for this name is that, for the simple example above, the definition (2.39)
leads to the ordinary momentum. Furthermore, the Euler-Lagrange equation

$$ \frac{\delta L}{\delta q_k} = \frac{\partial L}{\partial q_k} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q_k} \right) = 0 $$

tells us that this momentum is an integral of the motion whenever $\partial L/\partial q_k = 0$.
In other words, if $L$ does not depend explicitly on one (or several) of the $q_k$,

$$ L = L(q_1, \dots, q_{k-1}, q_{k+1}, \dots, q_f, \dot q_1, \dots, \dot q_k, \dots, \dot q_f, t)\,, $$

then the corresponding, conjugate momentum (momenta) is an (are) integral(s) of
the motion, $p_k = \text{const}$. If this is the case, such generalized coordinates $q_k$ are
said to be cyclic coordinates.

The question arises under which conditions (2.38) can be transformed to the 
form H(q, p, t). The answer is provided by what is called Legendre transforma- 
tion, to whose analysis we now turn. 




2.13 The Legendre Transformation for the Case 
of One Variable 



Let $f(x)$ be a real, differentiable function (at least $C^2$). Let $y = f(x)$, $z = df/dx$
and assume that $d^2 f/dx^2 \neq 0$. Then, by the implicit function theorem, $x = g(z)$,
the inverse function of $z = df(x)/dx$, exists. The theorem also guarantees the
existence of the Legendre transform of $f$, which is defined as follows:

$$ (\mathcal{L}f)(x) = x\,\frac{df}{dx} - f(x) = g(z)\,z - f(g(z)) \equiv \mathcal{L}f(z)\,. \tag{2.40} $$

Thus, as long as $d^2 f/dx^2 \neq 0$, $\mathcal{L}f(z)$ is well defined. It is then possible to
construct also $\mathcal{L}\mathcal{L}f(z)$, i.e. to apply the Legendre transformation twice. One obtains

$$ \frac{d}{dz}\, \mathcal{L}f(z) = g(z) + z\,\frac{dg}{dz} - \frac{df}{dx}\frac{dx}{dz} = x + z\,\frac{dg}{dz} - z\,\frac{dg}{dz} = x\,. $$



3 There are cases where one must take care with the position of indices: $q^j$ (superscript), but
$p_j = \partial L/\partial \dot q^j$ (subscript). Here we do not have to distinguish between the two positions yet.
This will be important, though, in Chaps. 4 and 5.
Its second derivative does not vanish, because

$$ \frac{d^2}{dz^2}\, \mathcal{L}f(z) = \frac{dx}{dz} = \frac{1}{d^2 f/dx^2} \neq 0\,. $$

Therefore, if we set $\mathcal{L}f(z) = \Phi(z) = xz - f$,

$$ \mathcal{L}\mathcal{L}f(z) \equiv \mathcal{L}\Phi = z\,\frac{d\Phi}{dz} - \Phi(z) = zx - xz + f = f\,. $$

This means that the transformation

$$ f \to \mathcal{L}f $$

is one-to-one whenever $d^2 f/dx^2 \neq 0$.

For the sake of illustration we consider two examples. 

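Example (i) Let $f(x) = m x^2/2$. Then $z = df/dx = mx$ and $d^2 f/dx^2 = m \neq 0$.
Thus $x = g(z) = z/m$ and $\mathcal{L}f(z) = (z/m)\,z - m(z/m)^2/2 = z^2/2m$.

Example (ii) Let $f(x) = x^\alpha/\alpha$. Then $z = x^{\alpha-1}$, $d^2 f/dx^2 = (\alpha - 1)\,x^{\alpha-2} \neq 0$,
provided $\alpha \neq 1$, and, if $\alpha \neq 2$, provided also $x \neq 0$. The inverse is

$$ x = g(z) = z^{1/(\alpha-1)} $$

and therefore

$$ \mathcal{L}f(z) = z^{1/(\alpha-1)}\, z - \frac{1}{\alpha}\, z^{\alpha/(\alpha-1)} = \frac{\alpha - 1}{\alpha}\, z^{\alpha/(\alpha-1)} \equiv \frac{1}{\beta}\, z^\beta \quad\text{with}\quad \beta \overset{\text{def}}{=} \frac{\alpha}{\alpha - 1}\,. $$

We note the relation $1/\alpha + 1/\beta = 1$. As a result we have

$$ f(x) = \frac{1}{\alpha}\, x^\alpha \ \longleftrightarrow\ \mathcal{L}f(z) = \frac{1}{\beta}\, z^\beta \quad\text{with}\quad \frac{1}{\alpha} + \frac{1}{\beta} = 1\,. $$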
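
The two examples can be reproduced numerically by following the definition (2.40) literally: solve $z = df/dx$ for $x$ and evaluate $xz - f(x)$. The following is a minimal sketch under the assumption that $df/dx$ is strictly increasing on a bracketing interval, so that simple bisection finds the inverse $g(z)$. The function name, the bracket, and the sample values $m = 2$, $\alpha = 3$ are hypothetical.

```python
def legendre_transform(f, df, z, lo, hi, tol=1e-13):
    """Return (Lf)(z) = x*z - f(x), where x solves df(x) = z.

    Assumes df is strictly increasing on [lo, hi] (f'' > 0 there)
    and that df(lo) <= z <= df(hi); the root is found by bisection.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if df(mid) < z:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x * z - f(x)

# Example (i): f(x) = m x^2 / 2 with m = 2, expected z^2 / (2 m).
m = 2.0
quadratic = legendre_transform(lambda x: 0.5 * m * x * x,
                               lambda x: m * x, 3.0, 0.0, 10.0)

# Example (ii): f(x) = x^alpha / alpha with alpha = 3, expected z^beta / beta.
alpha = 3.0
beta = alpha / (alpha - 1.0)
power = legendre_transform(lambda x: x ** alpha / alpha,
                           lambda x: x ** (alpha - 1.0), 4.0, 0.0, 10.0)

if __name__ == "__main__":
    print(quadratic, 3.0 ** 2 / (2.0 * m))
    print(power, 4.0 ** beta / beta)
```

Because $xz - f(x)$ is stationary in $x$ at the solution of $z = df/dx$, a small error in the root affects the transform only at second order.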

If a Lagrangian function is given (here for a system with $f = 1$), the Legendre
transform is nothing but the passage to the Hamiltonian function that we sketched
in Sect. 2.12. Indeed, if $x$ is taken to be the variable $\dot q$ and $f(x)$ the function
$L(q,\dot q,t)$ of $\dot q$, then according to (2.40)

$$ \mathcal{L}L(q,\dot q,t) = \dot q(q,p,t) \cdot p - L(q, \dot q(q,p,t), t) = H(q,p,t)\,, $$

where $\dot q(q,p,t)$ is the inverse function of

$$ p = \frac{\partial L}{\partial \dot q}(q,\dot q,t)\,. $$

The inverse exists if $\partial^2 L/\partial \dot q^2$ is nonzero. If this condition is fulfilled, $\dot q$ can be
eliminated and is expressed by $q$, $p$, and $t$. In the case studied here, the initial function
also depends on other variables such as $q$ and $t$. Clearly their presence does
not affect the Legendre transformation, which concerns the variable $\dot q$. (However,
in the general case, it will be important to state with respect to which variable the
transform is taken.)

With the same condition as above one can apply the Legendre transformation to
the Hamiltonian function, replacing $p$ by $p(q,\dot q,t)$ and obtaining the Lagrangian
function again.

The generalization to more than one degree of freedom is easy but requires a 
little more writing. 



2.14 The Legendre Transformation 
for the Case of Several Variables 



Let the function $F(x_1, \dots, x_m; u_1, \dots, u_n)$ be $C^2$ in all $x_k$ and assume that

$$ \det\left( \frac{\partial^2 F}{\partial x_k\, \partial x_i} \right) \neq 0\,. \tag{2.41} $$

The equations

$$ y_k = \frac{\partial F}{\partial x_k}(x_1, \dots, x_m; u_1, \dots, u_n)\,, \qquad k = 1, 2, \dots, m \tag{2.42} $$

can then be solved locally in terms of the $x_i$, i.e.

$$ x_i = \varphi_i(y_1, \dots, y_m; u_1, \dots, u_n)\,, \qquad i = 1, 2, \dots, m\,. $$

The Legendre transform of $F$ is defined as follows:

$$ G(y_1, \dots, y_m; u_1, \dots, u_n) \equiv \mathcal{L}F = \sum_{k=1}^{m} y_k \varphi_k - F\,. \tag{2.43} $$

We then have

$$ \frac{\partial G}{\partial y_k} = \varphi_k\,; \qquad \frac{\partial G}{\partial u_i} = -\frac{\partial F}{\partial u_i} \qquad\text{and}\qquad \det\left( \frac{\partial^2 G}{\partial y_k\, \partial y_l} \right) \det\left( \frac{\partial^2 F}{\partial x_i\, \partial x_j} \right) = 1\,. $$

As in the one-dimensional case this transformation is then one-to-one. This result
can be applied directly to the Lagrangian function if we identify the variables
$(x_1 \dots x_m)$ with $(\dot q_1 \dots \dot q_f)$ and $(u_1 \dots u_n)$ with $(q_1 \dots q_f, t)$. We start from the
function $L(q,\dot q,t)$ and define the generalized momenta as in (2.39):

$$ p_k \overset{\text{def}}{=} \frac{\partial}{\partial \dot q_k}\, L(q,\dot q,t)\,. $$

These equations can be solved locally and uniquely in terms of the $\dot q_k$ precisely
if the condition

$$ \det\left( \frac{\partial^2 L}{\partial \dot q_k\, \partial \dot q_i} \right) \neq 0 \tag{2.44} $$

is fulfilled.⁴ In this case $\dot q_k = \dot q_k(q,p,t)$ and the Hamiltonian function is given by

$$ H(q,p,t) = \mathcal{L}L(q,p,t) = \sum_{k=1}^{f} p_k\, \dot q_k(q,p,t) - L(q, \dot q(q,p,t), t)\,. $$



With the same condition (2.44), a two-fold application of the Legendre transfor- 
mation leads back to the original Lagrangian function. 

Is it possible to formulate the equations of motion by means of the Hamiltonian 
instead of the Lagrangian function? The answer follows directly from our equations 
above, viz. 

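$$ \dot q_k = \frac{\partial H}{\partial p_k}\,, \qquad \det\left( \frac{\partial^2 H}{\partial p_j\, \partial p_i} \right) \neq 0\,, \qquad\text{and}\qquad \frac{\partial H}{\partial q_k} = -\frac{\partial L}{\partial q_k} = -\dot p_k\,. $$

We obtain the following system of equations of motion:

$$ \dot q_k = \frac{\partial H}{\partial p_k}\,; \qquad \dot p_k = -\frac{\partial H}{\partial q_k}\,, \qquad k = 1, \dots, f\,. \tag{2.45} $$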



These equations are called the canonical equations. They contain only the Hamiltonian
function $H(q,p,t)$ and the variables $q$, $p$, $t$. We note that (2.45) is a system
of $2f$ ordinary differential equations of first order. They replace the $f$ differential
equations of second order that we obtained in the Lagrangian formalism. They are
completely equivalent to the Euler-Lagrange equations, provided (2.44) holds.



2.15 Canonical Systems 

Definition. A mechanical system is said to be canonical if it admits a Hamiltonian 
function such that its equations of motion take the form (2.45). 



Proposition. Every Lagrangian system that fulfills the condition (2.44) is
canonical. The converse holds also: if $\det(\partial^2 H/\partial p_k\, \partial p_i) \neq 0$, then every
canonical system with $f$ degrees of freedom obeys the Euler-Lagrange
equations with $L(q,\dot q,t)$ given by

$$ L(q,\dot q,t) = \mathcal{L}H(q,\dot q,t) = \sum_{k=1}^{f} \dot q_k\, p_k(q,\dot q,t) - H(q, p(q,\dot q,t), t)\,. \tag{2.46} $$



4 In mechanics the kinetic energy and, hence, the Lagrangian are positive-definite (but not necessarily
homogeneous) quadratic functions of the variables $\dot q$. In this situation, solving the defining
equations for $p_k$ in terms of the $\dot q_i$ yields a unique solution also globally.
Remarks: One might wonder about the specific form (2.40) or (2.43) of the Legendre
transformation which, when supplemented by the condition (2.44) on the
second derivatives, guarantees its bijective, in fact diffeomorphic nature. The following
two remarks may be helpful in clarifying matters further.

1. For simplicity, let us write the equations for the case of one degree of freedom,
$f = 1$, the generalization to more than one degree of freedom having been
clarified in Sect. 2.14. Depending on whether the Lagrangian function or the Hamiltonian
function is given, one constructs the hybrid

$$ \tilde H(q,\dot q) = \dot q\, \frac{\partial L(q,\dot q)}{\partial \dot q} - L(q,\dot q)\,, \qquad\text{or} $$

$$ \tilde L(q,p) = \frac{\partial H(q,p)}{\partial p}\, p - H(q,p)\,, $$

i.e. auxiliary quantities that still depend on the “wrong” variables. If the first or
the second of the conditions

$$ \frac{\partial^2 L(q,\dot q)}{\partial \dot q\, \partial \dot q} \neq 0\,, \qquad \frac{\partial^2 H(q,p)}{\partial p\, \partial p} \neq 0 $$

is fulfilled then the equations

$$ p = \frac{\partial L}{\partial \dot q}\,, \qquad \dot q = \frac{\partial H}{\partial p} $$

can be solved for $\dot q$ as a function of $q$ and $p$ in the first case, for $p$ as a function
of $q$ and $\dot q$ in the second, so that the transition from $L(q,\dot q)$ to $H(q,p)$, or the
inverse, from $H(q,p)$ to $L(q,\dot q)$, becomes possible. An important aspect of Legendre
transformation is the obvious symmetry between $L$ and $H$. The condition
on the second derivatives guarantees its uniqueness.

2. The condition on the second derivative tells us that the function to be trans- 
formed is either convex or concave. In this connection it might be useful to consult 
exercise 2.14 and its solution. In fact, for the Legendre transformation to exist, 
the weaker condition of convexity of the function (or its negative) is the essen- 
tial requirement, not its differentiability. This weaker form is important for other 
branches of physics, such as thermodynamics of equilibrium states, or quantum 
field theory. 



2.16 Examples of Canonical Systems 

We illustrate the results of the previous sections by two instructive examples. 

Example (i) Motion of a particle in a central field. As the angular momentum
is conserved the motion takes place in a plane perpendicular to $\boldsymbol{\ell}$. We introduce
polar coordinates in that plane and write the Lagrangian function in its natural
form. With $x_1 = r\cos\varphi$, $x_2 = r\sin\varphi$, one finds $v^2 = \dot r^2 + r^2 \dot\varphi^2$, and thus

$$ L = T - U(r) = \frac{1}{2}\, m\,(\dot r^2 + r^2 \dot\varphi^2) - U(r)\,. \tag{2.47} $$

Here $q_1 = r$, $q_2 = \varphi$, and $p_1 \equiv p_r = m\dot r$, $p_2 \equiv p_\varphi = m r^2 \dot\varphi$. The determinant of
the matrix of second derivatives of $L$ by the $\dot q_j$ is

$$ \det\left( \frac{\partial^2 L}{\partial \dot q_j\, \partial \dot q_i} \right) = \det \begin{pmatrix} m & 0 \\ 0 & m r^2 \end{pmatrix} = m^2 r^2 \neq 0 \quad\text{for}\quad r \neq 0\,. $$

The Hamiltonian function can be constructed uniquely and is given by

$$ H(q,p) = \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2m r^2} + U(r)\,. \tag{2.48} $$



The canonical equations (2.45) read as follows:

$$ \dot r = \frac{\partial H}{\partial p_r} = \frac{1}{m}\, p_r\,, \qquad \dot\varphi = \frac{\partial H}{\partial p_\varphi} = \frac{1}{m} \frac{p_\varphi}{r^2}\,, \tag{2.49a} $$

$$ \dot p_r = -\frac{\partial H}{\partial r} = \frac{p_\varphi^2}{m r^3} - \frac{\partial U}{\partial r}\,, \qquad \dot p_\varphi = 0\,. \tag{2.49b} $$

Comparison with Example 1.24 shows that $p_\varphi \equiv \ell$ is the modulus of angular
momentum and is conserved. Indeed, from the expression (2.47) for $L$, we note
that $\varphi$ is a cyclic coordinate. The first equation (2.49b), when multiplied by $p_r$
and then integrated once, gives (1.62) of Example 1.24. This shows that $H(q,p)$
is conserved when taken along a solution curve of (2.45).
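
The conservation of $p_\varphi$ and of $H$ along solutions of (2.49) can be observed in a short numerical experiment. The sketch below is an illustration only and rests on assumptions not made in the text: the attractive potential $U(r) = -1/r$ in units with $m = 1$, a classical fourth-order Runge-Kutta step, and arbitrarily chosen initial values. All names are hypothetical.

```python
def hamiltonian(state, m=1.0):
    """H of (2.48) with the assumed potential U(r) = -1/r."""
    r, phi, p_r, p_phi = state
    return p_r ** 2 / (2.0 * m) + p_phi ** 2 / (2.0 * m * r ** 2) - 1.0 / r

def canonical_rhs(state, m=1.0):
    """Right-hand sides of (2.49a) and (2.49b) for U(r) = -1/r."""
    r, phi, p_r, p_phi = state
    return (p_r / m,                                   # dr/dt
            p_phi / (m * r ** 2),                      # dphi/dt
            p_phi ** 2 / (m * r ** 3) - 1.0 / r ** 2,  # dp_r/dt
            0.0)                                       # dp_phi/dt (phi cyclic)

def rk4_step(state, dt):
    """One classical Runge-Kutta step for the canonical equations."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))

    k1 = canonical_rhs(state)
    k2 = canonical_rhs(shifted(k1, 0.5 * dt))
    k3 = canonical_rhs(shifted(k2, 0.5 * dt))
    k4 = canonical_rhs(shifted(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

start = (1.0, 0.0, 0.0, 1.2)   # r, phi, p_r, p_phi (bound orbit)
state = start
for _ in range(20000):         # integrate up to t = 20
    state = rk4_step(state, 1e-3)

energy_drift = abs(hamiltonian(state) - hamiltonian(start))

if __name__ == "__main__":
    print("p_phi:", start[3], "->", state[3])
    print("|H(t) - H(0)|:", energy_drift)
```

Since $\dot p_\varphi = 0$ identically, $p_\varphi$ keeps its initial value exactly, whereas $H$ is conserved only up to the truncation error of the integrator.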



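Example (ii) A charged particle in electromagnetic fields. Following the method
of Example (ii) of Sect. 2.8 we have

$$ L = \frac{1}{2}\, m \dot q^{\,2} - e\,\Phi(q,t) + \frac{e}{c}\, \dot q \cdot A(q,t)\,. $$

The canonically conjugate momenta are given by

$$ p_i = \frac{\partial L}{\partial \dot q_i} = m \dot q_i + \frac{e}{c}\, A_i(q,t)\,. $$

These equations can be solved for $\dot q_i$,

$$ \dot q_i = \frac{1}{m}\, p_i - \frac{e}{mc}\, A_i\,, $$

so that one obtains

$$ H = \sum_{i=1}^{3} \frac{p_i}{m} \left( p_i - \frac{e}{c} A_i \right) - \frac{1}{2m} \sum_{i=1}^{3} \left( p_i - \frac{e}{c} A_i \right)^2 + e\,\Phi - \frac{e}{mc} \sum_{i=1}^{3} \left( p_i - \frac{e}{c} A_i \right) A_i \tag{2.50} $$

or

$$ H(q,p,t) = \frac{1}{2m} \left( p - \frac{e}{c}\, A(q,t) \right)^2 + e\,\Phi(q,t)\,. \tag{2.51} $$

Note the following difference:

$$ m\dot q = p - \frac{e}{c}\, A $$

is the kinematic momentum,

$$ p_i \quad\text{with}\quad p_i = \frac{\partial L}{\partial \dot q_i} $$

is the (generalized) momentum canonically conjugate to $q_i$.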



2.17 The Variational Principle
Applied to the Hamiltonian Function



It is possible to obtain the canonical equations (2.45) directly from Hamilton’s
variational principle (Sects. 2.5 and 2.6). For this we apply the principle to the
following function:

$$ F(q, p, \dot q, \dot p, t) \overset{\text{def}}{=} \sum_{k=1}^{f} p_k \dot q_k - H(q,p,t)\,, \tag{2.52} $$

taking the $q$, $p$, $\dot q$, $\dot p$ as four sets of independent variables. In the language of
Sect. 2.5, $(q, p)$ corresponds to $y$, $(\dot q, \dot p)$ to $y'$, and $t$ to $x$. Requiring that

$$ \delta \int_{t_1}^{t_2} F\, dt = 0 \tag{2.53} $$

and varying the variables $q_k$ and $p_k$ independently, we get the Euler-Lagrange
equations $\delta F/\delta q_k = 0$, $\delta F/\delta p_k = 0$. When written out, these are

$$ \frac{d}{dt}\frac{\partial F}{\partial \dot q_k} = \frac{\partial F}{\partial q_k}\,, \quad\text{or}\quad \dot p_k = -\frac{\partial H}{\partial q_k}\,, $$

and

$$ \frac{d}{dt}\frac{\partial F}{\partial \dot p_k} = \frac{\partial F}{\partial p_k}\,, \quad\text{or}\quad 0 = \dot q_k - \frac{\partial H}{\partial p_k}\,. $$

Thus, we again obtain the canonical equations (2.45). We shall make use of this
result below when discussing canonical transformations.
3 Pk 
2.18 Symmetries and Conservation Laws

In Sects. 1.12 and 1.13 we studied the ten classical integrals of the motion of the 
closed n -particle system, as derived directly from Newton’s equations. In this sec- 
tion and in the subsequent ones we wish to discuss these results, as well as general- 
izations of them, in the framework of Lagrangian functions and the Euler-Lagrange 
equations. 

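Here and below we study closed, autonomous systems with $f$ degrees of freedom
to which we can ascribe Lagrangian functions $L(q,\dot q)$ without explicit time
dependence. Take the natural form for $L$,

$$ L = T(q,\dot q) - U(q)\,, \tag{2.54} $$

where $T$ is a homogeneous function of degree 2 in the $\dot q_i$. According to Euler’s
theorem on homogeneous functions we have

$$ \sum_{i=1}^{f} \dot q_i \frac{\partial T}{\partial \dot q_i} = 2T\,, \tag{2.55} $$

so that

$$ \sum_{i=1}^{f} \dot q_i \frac{\partial L}{\partial \dot q_i} - L = T + U = E\,. $$

This expression represents the energy of the system. For autonomous systems, $E$
is conserved along any orbit. Indeed, one finds that

$$ \frac{dE}{dt} = \frac{d}{dt}\left( \sum_{i=1}^{f} \frac{\partial L}{\partial \dot q_i}\, \dot q_i \right) - \sum_{i=1}^{f} \frac{\partial L}{\partial q_i}\, \dot q_i - \sum_{i=1}^{f} \frac{\partial L}{\partial \dot q_i}\, \ddot q_i = 0\,. $$

Note that we made use of the Euler-Lagrange equations, in the form $(d/dt)(\partial L/\partial \dot q_i) = \partial L/\partial q_i$, to cancel the first against the second sum.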



Remarks: In the framework of Lagrangian mechanics a dynamical quantity such 
as, for instance, the energy which is a candidate for a constant of the motion, at 
first is a function $E(q,\dot q)$ on velocity space spanned by $q$ and $\dot q$. Likewise, in
the framework of Hamiltonian canonical mechanics it is a function $E(q,p)$ on
phase space that is spanned by $q$ and $p$. Of course, such a function on either
velocity space or phase space, in general, is not constant. It is constant only,
if it represents an integral of the motion, along solutions of the equations of
motion. In other terms, its time derivative is equal to zero only if it is evaluated
along physical orbits along which $q$ and $\dot q$, or $q$ and $p$, respectively, are related to
each other via the equations of motion. For this reason this kind of time derivative
is sometimes called orbital derivative, as a short-hand for time derivative taken
along the orbit. The important point to note is that in order to study the variation
of a given function along physical orbits we need not know the solutions proper. 
Knowledge of the differential equations that describe the motion is sufficient for 
calculating the orbital derivative and to find out, for instance, whether that function 
is an integral of the system. 

Suppose that the mechanical system one is considering is invariant under a class 
of continuous transformations of the coordinates that can be deformed smoothly 
into the identical mapping. The system then possesses integrals of the motion , i.e. 
there are dynamical quantities that are constant along orbits of the system. The 
interesting observation is that it is sufficient to study these transformations in an 
infinitesimal neighborhood of the identity. This is made explicit in the following 
theorem by Emmy Noether, which applies to transformations of the space coordi- 
nates. 



2.19 Noether’s Theorem 



Let the Lagrangian function $L(q,\dot q)$ describing an autonomous system be
invariant under the transformation $q \mapsto h^s(q)$, where $s$ is a real, continuous
parameter and where $h^{s=0}(q) = q$ is the identity (see Fig. 2.7). Then there
exists an integral of the motion, given by

$$ I(q,\dot q) = \sum_{i=1}^{f} \frac{\partial L}{\partial \dot q_i} \left. \frac{d}{ds}\, h^s(q_i) \right|_{s=0}\,. \tag{2.56} $$




Fig. 2.7. A differentiable, one-parameter transforma- 
tion of the orbits in the neighborhood of the identical 
mapping. If it leaves the Lagrangian function invari- 
ant, there exists a constant of the motion correspond- 
ing to it 



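Proof. Let $q = \varphi(t)$ be a solution of the Euler-Lagrange equations. Then, by
assumption, $q(s,t) = \Phi(s,t) = h^s(\varphi(t))$ is a solution too. This means that

$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_i}\big( \Phi(s,t), \dot\Phi(s,t) \big) = \frac{\partial L}{\partial q_i}\big( \Phi(s,t), \dot\Phi(s,t) \big)\,, \qquad i = 1, \dots, f\,. \tag{2.57} $$

Furthermore, $L$ is invariant by assumption, i.e.

$$ \frac{d}{ds}\, L\big( \Phi(s,t), \dot\Phi(s,t) \big) = \sum_{i=1}^{f} \left[ \frac{\partial L}{\partial q_i} \frac{d\Phi_i}{ds} + \frac{\partial L}{\partial \dot q_i} \frac{d\dot\Phi_i}{ds} \right] = 0\,. \tag{2.58} $$

Combining (2.57) and (2.58) we obtain

$$ \sum_{i=1}^{f} \left[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q_i} \right) \frac{d\Phi_i}{ds} + \frac{\partial L}{\partial \dot q_i} \frac{d}{dt}\left( \frac{d\Phi_i}{ds} \right) \right] = \frac{d}{dt}\, I = 0\,. \qquad □ $$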



We study two examples; let the Lagrangian function have the form 

1 " 

L = - ^ ~2»i p r 2 p - U(r i, ... ,r„) . 

P =\ 



Example (i) Assume that the system is invariant under translations along the
x-axis:

$$h^s : \mathbf{r}_p \mapsto \mathbf{r}_p + s\,\hat{\mathbf{e}}_x , \quad p = 1, \dots, n .$$

We then have

$$\frac{d}{ds} h^s(\mathbf{r}_p) \Big|_{s=0} = \hat{\mathbf{e}}_x \quad\text{and}\quad I = \sum_{p=1}^{n} m_p \dot{x}^{(p)} = P_x .$$

The result is the following. Invariance under translations along the x-axis implies
conservation of the projection of the total momentum onto the x-axis. Similarly,
if the Lagrangian is invariant under translations along the direction $\hat{\mathbf{n}}$, then the
component of total momentum along that direction is conserved.

Example (ii) The same system is now assumed to be invariant under rotations
about the z-axis, cf. Fig. 2.8:

$$\mathbf{r}_p = (x^{(p)}, y^{(p)}, z^{(p)}) \to \mathbf{r}'_p = (x'^{(p)}, y'^{(p)}, z'^{(p)})$$

with (passive rotation)

$$x'^{(p)} = x^{(p)} \cos s + y^{(p)} \sin s , \quad y'^{(p)} = -x^{(p)} \sin s + y^{(p)} \cos s , \quad z'^{(p)} = z^{(p)} .$$

Here one obtains

$$\frac{d}{ds} \mathbf{r}'_p \Big|_{s=0} = (y^{(p)}, -x^{(p)}, 0) = \mathbf{r}_p \times \hat{\mathbf{e}}_z$$

and

$$I = \sum_{p=1}^{n} m_p \dot{\mathbf{r}}_p \cdot (\mathbf{r}_p \times \hat{\mathbf{e}}_z) = \sum_{p=1}^{n} m_p\, \hat{\mathbf{e}}_z \cdot (\dot{\mathbf{r}}_p \times \mathbf{r}_p) = -l_z .$$

Fig. 2.8. If the Lagrangian function for a mass point is invariant
under rotations about the z-axis, the projection of angular momentum
onto that axis is conserved



The conserved quantity is found to be the projection of the total angular momentum
onto the z-axis. More generally, if L is invariant under rotations about any direction
in space (the potential energy must then be spherically symmetric), then the total 
angular momentum is conserved as a whole. 
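
The identification $I = -l_z$ of Example (ii) can be checked numerically for a single mass point. The following is only a minimal sketch: the function names and the numerical values are illustrative assumptions, not taken from the text, and the derivative $d h^s/ds$ at $s = 0$ is approximated by a central difference.

```python
import math

def rotate_passive(r, s):
    """Passive rotation about the z-axis by the angle s, as in Example (ii)."""
    x, y, z = r
    return (x * math.cos(s) + y * math.sin(s),
            -x * math.sin(s) + y * math.cos(s),
            z)

def noether_integral(m, r, v, h=1e-6):
    """I = m v . (d/ds h^s(r)) at s = 0; the s-derivative is a central difference."""
    plus = rotate_passive(r, h)
    minus = rotate_passive(r, -h)
    dh = [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]
    return m * sum(vi * di for vi, di in zip(v, dh))

def l_z(m, r, v):
    """z-component of the angular momentum m r x v."""
    return m * (r[0] * v[1] - r[1] * v[0])

# Illustrative values (assumed for this sketch only).
m, r, v = 2.0, (1.0, -0.5, 0.3), (0.2, 0.7, -1.1)
print(noether_integral(m, r, v), -l_z(m, r, v))
```

Both printed numbers should agree up to the error of the finite difference.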

To what extent there exists an inverse to Noether’s theorem, that is to say under
which conditions the existence of an integral of the motion implies invariance of the
system with respect to a continuous transformation, will be clarified in Sects. 2.34
and 2.41 below.



2.20 The Generator for Infinitesimal Rotations About an Axis 



In the two previous examples and, generally, in Noether’s theorem $h^s$ is a
one-parameter group of diffeomorphisms that have the special property that they can be
deformed, in a continuous manner, into the identity. The integral of the motion, I,
only depends on the derivative of $h^s$ at $s = 0$, which means that the transformation
group is needed only in the neighborhood of the identity. Here we wish to pursue
the analysis of Example (ii), Sect. 2.19. This will give a first impression of the
importance of continuous groups of transformations in mechanics.

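For infinitesimally small $s$ the rotation about the z-axis of Example (ii) can be
written as follows:

$$h^s(\mathbf{r}) = \left\{ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} - s \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \right\} \mathbf{r} + O(s^2) = (\mathbf{1} - s\,\mathbf{J}_z)\,\mathbf{r} + O(s^2) . \tag{2.59}$$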



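The $3 \times 3$ matrix $\mathbf{J}_z$ is said to be the generator for infinitesimal rotations about
the z-axis. In fact, one can show that the rotation about the z-axis by a finite angle

$$\mathbf{r}' = \mathbf{R}_z(\varphi)\,\mathbf{r} \stackrel{\text{def}}{=} \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} \mathbf{r} \tag{2.60}$$

can be constructed from the infinitesimal rotations (2.59). This is seen as follows.
Let

$$\mathbf{M} \stackrel{\text{def}}{=} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} .$$

One verifies easily that $\mathbf{M}^2 = -\mathbf{1}$, $\mathbf{M}^3 = -\mathbf{M}$, $\mathbf{M}^4 = +\mathbf{1}$, etc. or, more generally,
that $\mathbf{M}^{2n} = (-1)^n \mathbf{1}$, $\mathbf{M}^{2n+1} = (-1)^n \mathbf{M}$.

From the well-known Taylor series for the sine and cosine functions one has

$$\mathbf{A} \stackrel{\text{def}}{=} \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} = \mathbf{1} \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} \varphi^{2n} - \mathbf{M} \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} \varphi^{2n+1}$$

and, inserting the formulae for the even and odd powers of the matrix $\mathbf{M}$,

$$\mathbf{A} = \sum_{n=0}^{\infty} \frac{1}{(2n)!} \mathbf{M}^{2n} \varphi^{2n} - \sum_{n=0}^{\infty} \frac{1}{(2n+1)!} \mathbf{M}^{2n+1} \varphi^{2n+1} = \exp(-\mathbf{M}\varphi) . \tag{2.61}$$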



It is then not difficult to convince oneself that the $3 \times 3$ matrix $\mathbf{R}_z(\varphi)$ of (2.60)
can also be written as an exponential series, as in (2.61), viz.

$$\mathbf{R}_z(\varphi) = \exp(-\mathbf{J}_z \varphi) . \tag{2.62}$$

This result can be understood as follows. In (2.59) take $s = \varphi/n$ with $n$ a positive
integer, large compared to 1. Assume then that we perform $n$ such rotations in a
series, i.e.

$$\left( \mathbf{1} - \frac{\varphi}{n} \mathbf{J}_z \right)^n .$$

Finally, let $n$ go to infinity. In this limit use Euler’s formula for the exponential,
$\lim_{n\to\infty} (1 + x/n)^n = e^x$, to obtain

$$\lim_{n \to \infty} \left( \mathbf{1} - \frac{\varphi}{n} \mathbf{J}_z \right)^n = \exp(-\mathbf{J}_z \varphi) .$$

Clearly, these results can be extended to rotations about any other direction in 
space. 
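
The two routes to (2.62), the power series and the limit of many small rotations, can be compared numerically. The sketch below is an illustration under stated assumptions: all helper names are invented for this example, the series is simply truncated after a fixed number of terms, and the limit $n \to \infty$ is replaced by $n = 2^{20}$ evaluated through repeated squaring.

```python
import math

J_Z = [[0.0, -1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scaled(a, c):
    return [[c * x for x in row] for row in a]

def exp_series(a, terms=30):
    """Truncated power series 1 + a + a^2/2! + ... for a square matrix a."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scaled(matmul(term, a), 1.0 / k)   # term now equals a^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def product_limit(phi, doublings=20):
    """(1 - (phi/n) J_z)^n with n = 2**doublings, computed by repeated squaring."""
    n = 2 ** doublings
    step = [[i - (phi / n) * j for i, j in zip(ri, rj)]
            for ri, rj in zip(identity(3), J_Z)]
    for _ in range(doublings):
        step = matmul(step, step)
    return step

def rotation_z(phi):
    """The finite passive rotation (2.60)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

phi = 0.8  # illustrative angle
print(max_diff(exp_series(scaled(J_Z, -phi)), rotation_z(phi)))
print(max_diff(product_limit(phi), rotation_z(phi)))
```

The first difference should be at the level of rounding errors; the second is small but finite, because $n$ is large rather than infinite.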

The appearance of finite-dimensional matrices in the argument of an exponential
function is perhaps not familiar to the reader. There is nothing mysterious about
such exponentials. They are defined through the power series

$$\exp\{\mathbf{A}\} = \mathbf{1} + \mathbf{A} + \frac{\mathbf{A}^2}{2!} + \dots + \frac{\mathbf{A}^k}{k!} + \dots ,$$

where $\mathbf{A}$, like any finite power $\mathbf{A}^k$ in the series, is an $n \times n$ matrix. As the
exponential is an entire function (its Taylor expansion converges for any finite value
of the argument), there is no problem of convergence of this series.



2.21 More About the Rotation Group

Let $\mathbf{x} = (x_1, x_2, x_3)$ be a point on a physical orbit $\mathbf{x}(t)$, $x_1$, $x_2$ and $x_3$ being its
(Cartesian) coordinates with respect to the frame of reference K. The same point,
when described within the frame K' whose origin coincides with that of K but
which is rotated by the angle $\varphi$ about the direction $\hat{\boldsymbol{\varphi}}$, is represented by

$$\mathbf{x}|_{K'} = (x'_1, x'_2, x'_3) \quad\text{with}\quad x'_i = \sum_{k=1}^{3} R_{ik} x_k \quad\text{or}\quad \mathbf{x}' = \mathbf{R}\mathbf{x} . \tag{2.63}$$

(This is a passive rotation.) By definition, rotations leave the length unchanged.
Thus $\mathbf{x}'^2 = \mathbf{x}^2$, or, when written out more explicitly,

$$(\mathbf{R}\mathbf{x}) \cdot (\mathbf{R}\mathbf{x}) = \mathbf{x}\,\mathbf{R}^T \mathbf{R}\,\mathbf{x} = \mathbf{x}^2 ,$$

or, in components and even more explicitly,

$$\sum_{i=1}^{3} x'_i x'_i = \sum_{k=1}^{3} \sum_{l=1}^{3} \left( \sum_{i=1}^{3} R_{ik} R_{il} \right) x_k x_l = \sum_{k=1}^{3} \sum_{l=1}^{3} \delta_{kl}\, x_k x_l .$$

One thus obtains the condition

$$\sum_{i=1}^{3} (\mathbf{R}^T)_{ki} R_{il} = \delta_{kl} ,$$

i.e. $\mathbf{R}$ must be a real, orthogonal matrix:

$$\mathbf{R}^T \mathbf{R} = \mathbf{1} . \tag{2.64}$$

From (2.64) one concludes that $(\det \mathbf{R})^2 = 1$ or $\det \mathbf{R} = \pm 1$.

We restrict the discussion to the rotation matrices with determinant +1 and
leave aside space inversion (cf. Sect. 1.13). The matrices $\mathbf{R}$ with $\det \mathbf{R} = +1$ form
a group, the special orthogonal group in three real dimensions

$$SO(3) = \{ \mathbf{R} : \mathbb{R}^3 \to \mathbb{R}^3 \ \text{linear} \mid \det \mathbf{R} = +1 ,\ \mathbf{R}^T \mathbf{R} = \mathbf{1} \} . \tag{2.65}$$

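As shown in Sect. 1.13, every such $\mathbf{R}$ depends on 3 real parameters and can be
deformed continuously into the identity $\mathbf{R}^0 = \mathbf{1}$. A possible parametrization is the
following. Take a vector $\boldsymbol{\varphi}$ whose direction $\hat{\boldsymbol{\varphi}} \stackrel{\text{def}}{=} \boldsymbol{\varphi}/\varphi$ defines the axis about which
the rotation takes place and whose modulus $\varphi = |\boldsymbol{\varphi}|$ defines the angle of rotation,
as indicated in Fig. 2.9:

$$\mathbf{R} \equiv \mathbf{R}(\boldsymbol{\varphi}) \quad\text{with}\quad \boldsymbol{\varphi} : \ \hat{\boldsymbol{\varphi}} = \boldsymbol{\varphi}/\varphi , \quad \varphi = |\boldsymbol{\varphi}| , \quad 0 \le \varphi \le 2\pi . \tag{2.66}$$

(We shall meet other parametrizations in developing the theory of the rigid body
in Chap. 3.) The action of $\mathbf{R}(\boldsymbol{\varphi})$ on $\mathbf{x}$ can be expressed explicitly in terms of the
vectors $\mathbf{x}$, $\hat{\boldsymbol{\varphi}} \times \mathbf{x}$ and $\hat{\boldsymbol{\varphi}} \times (\hat{\boldsymbol{\varphi}} \times \mathbf{x})$, for a passive rotation, by

$$\mathbf{x}' = \mathbf{R}(\boldsymbol{\varphi})\mathbf{x} = (\hat{\boldsymbol{\varphi}} \cdot \mathbf{x})\hat{\boldsymbol{\varphi}} - \hat{\boldsymbol{\varphi}} \times \mathbf{x}\, \sin\varphi - \hat{\boldsymbol{\varphi}} \times (\hat{\boldsymbol{\varphi}} \times \mathbf{x}) \cos\varphi . \tag{2.67}$$

Fig. 2.9. Rotation of the coordinate system about the
direction $\hat{\boldsymbol{\varphi}}$ by the (fixed) angle $\varphi$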

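This is shown as follows. The vectors $\hat{\boldsymbol{\varphi}}$, $\hat{\boldsymbol{\varphi}} \times \mathbf{x}$, and $\hat{\boldsymbol{\varphi}} \times (\hat{\boldsymbol{\varphi}} \times \mathbf{x})$ are mutually
orthogonal. For example, if the 3-axis is taken along the direction $\hat{\boldsymbol{\varphi}}$, i.e. if $\hat{\boldsymbol{\varphi}} = (0, 0, 1)$,
then $\mathbf{x} = (x_1, x_2, x_3)$, $\hat{\boldsymbol{\varphi}} \times \mathbf{x} = (-x_2, x_1, 0)$, and $\hat{\boldsymbol{\varphi}} \times (\hat{\boldsymbol{\varphi}} \times \mathbf{x}) = (-x_1, -x_2, 0)$.
With respect to the new coordinate system the same vector has the components

$$x'_1 = x_1 \cos\varphi + x_2 \sin\varphi , \quad x'_2 = -x_1 \sin\varphi + x_2 \cos\varphi , \quad x'_3 = x_3 ,$$

in agreement with (2.67). One now verifies that (2.67) holds true also when $\hat{\boldsymbol{\varphi}}$ is
not along the 3-axis. The first term on the right-hand side of (2.67) contains the
information that the projection of $\mathbf{x}$ onto $\hat{\boldsymbol{\varphi}}$ stays invariant, while the remaining
two terms represent the rotation in the plane perpendicular to $\hat{\boldsymbol{\varphi}}$. Making use of
the identity $\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{b}(\mathbf{a} \cdot \mathbf{c}) - \mathbf{c}(\mathbf{a} \cdot \mathbf{b})$, (2.67) becomes

$$\mathbf{x}' = \mathbf{x} \cos\varphi - \hat{\boldsymbol{\varphi}} \times \mathbf{x}\, \sin\varphi + (1 - \cos\varphi)(\hat{\boldsymbol{\varphi}} \cdot \mathbf{x})\hat{\boldsymbol{\varphi}} . \tag{2.68}$$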

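We now show the following.

(i) $\mathbf{R}(\boldsymbol{\varphi})$ as parametrized in (2.67) belongs to SO(3):

$$\mathbf{x}'^2 = (\hat{\boldsymbol{\varphi}} \cdot \mathbf{x})^2 + (\hat{\boldsymbol{\varphi}} \times \mathbf{x})^2 \sin^2\varphi + (\hat{\boldsymbol{\varphi}} \times (\hat{\boldsymbol{\varphi}} \times \mathbf{x}))^2 \cos^2\varphi = \mathbf{x}^2 [\cos^2\alpha + \sin^2\alpha \sin^2\varphi + \sin^2\alpha \cos^2\varphi] = \mathbf{x}^2 .$$

Here $\alpha$ denotes the angle between the vectors $\hat{\boldsymbol{\varphi}}$ and $\mathbf{x}$ (see Fig. 2.10). If $\boldsymbol{\varphi}$ and $\boldsymbol{\varphi}'$
are parallel, then $\mathbf{R}(\boldsymbol{\varphi}')\mathbf{R}(\boldsymbol{\varphi}) = \mathbf{R}(\boldsymbol{\varphi} + \boldsymbol{\varphi}')$. This means that $\mathbf{R}(\boldsymbol{\varphi})$ can be deformed
continuously into the identity $\mathbf{R}(0) = \mathbf{1}$ and therefore that $\det \mathbf{R}(\boldsymbol{\varphi}) = +1$.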




Fig. 2.10. Definition of the angle $\alpha$ between the position
vector $\mathbf{x}$ and the direction about which the rotation takes
place

(ii) Every $\mathbf{R}$ of SO(3) can be written in the form (2.67). Consider first those
vectors $\mathbf{x}$ which remain unchanged under $\mathbf{R}$ (up to a factor), $\mathbf{R}\mathbf{x} = \lambda\mathbf{x}$. This means
that

$$\det(\mathbf{R} - \lambda\mathbf{1}) = 0$$

must hold. This is a cubic polynomial with real coefficients. Therefore, it always
has at least one real eigenvalue $\lambda$, which is +1 or −1 because of the condition
$(\mathbf{R}\mathbf{x})^2 = \mathbf{x}^2$. In the plane perpendicular to the corresponding eigenvector, $\mathbf{R}$ must
have the form

$$\begin{pmatrix} \cos\Psi & \sin\Psi \\ -\sin\Psi & \cos\Psi \end{pmatrix}$$

in order to fulfill the condition (2.64). Finally, $\Psi$ must be equal to $\varphi$ because
$\det \mathbf{R} = 1$ and because $\mathbf{R}$ can be deformed continuously into the identity. Thus,
$\mathbf{R}(\boldsymbol{\varphi})$ must have the decomposition (2.67).
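
The parametrization (2.68) lends itself to a quick numerical check of two properties used above: the length of $\mathbf{x}$ is preserved, and two rotations about the same axis compose by adding their angles. This is a minimal sketch under the assumption of a passive rotation with a unit axis vector; the function names and sample values are illustrative only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate_passive(x, axis, phi):
    """Components x' according to (2.68); `axis` must be a unit vector."""
    c, s = math.cos(phi), math.sin(phi)
    axis_cross_x = cross(axis, x)
    along = dot(axis, x)
    return tuple(c * xi - s * ci + (1.0 - c) * along * ni
                 for xi, ci, ni in zip(x, axis_cross_x, axis))

# Illustrative axis and vector (assumed values).
norm = math.sqrt(1.0 + 4.0 + 4.0)
axis = (1.0 / norm, 2.0 / norm, -2.0 / norm)
x = (0.3, -1.2, 2.5)

twice = rotate_passive(rotate_passive(x, axis, 0.4), axis, 0.9)
direct = rotate_passive(x, axis, 1.3)
print(dot(x, x), dot(direct, direct))   # the squared length is unchanged
print(twice, direct)                    # angles about a common axis add up
```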



2.22 Infinitesimal Rotations and Their Generators 



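Assume now that $\varphi \equiv \varepsilon \ll 1$. Then, from (2.67) and (2.68), respectively,

$$\mathbf{x}' = (\hat{\boldsymbol{\varphi}} \cdot \mathbf{x})\hat{\boldsymbol{\varphi}} - (\hat{\boldsymbol{\varphi}} \times \mathbf{x})\varepsilon - \hat{\boldsymbol{\varphi}} \times (\hat{\boldsymbol{\varphi}} \times \mathbf{x}) + O(\varepsilon^2) = \mathbf{x} - (\hat{\boldsymbol{\varphi}} \times \mathbf{x})\varepsilon + O(\varepsilon^2) . \tag{2.69}$$

Writing this out in components, one obtains

$$\mathbf{x}' = \mathbf{x} - \varepsilon \begin{pmatrix} 0 & -\hat{\varphi}_3 & \hat{\varphi}_2 \\ \hat{\varphi}_3 & 0 & -\hat{\varphi}_1 \\ -\hat{\varphi}_2 & \hat{\varphi}_1 & 0 \end{pmatrix} \mathbf{x} + O(\varepsilon^2)$$

$$\phantom{\mathbf{x}'} = \mathbf{x} - \varepsilon \left[ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} \hat{\varphi}_1 + \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix} \hat{\varphi}_2 + \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \hat{\varphi}_3 \right] \mathbf{x} + O(\varepsilon^2) . \tag{2.70}$$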

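This is the decomposition of the infinitesimal rotation into rotations about the three
directions $\hat{\varphi}_1$, $\hat{\varphi}_2$, and $\hat{\varphi}_3$. Denoting the matrices in this expression by

$$\mathbf{J}_1 \stackrel{\text{def}}{=} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} , \quad \mathbf{J}_2 \stackrel{\text{def}}{=} \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix} , \quad \mathbf{J}_3 \stackrel{\text{def}}{=} \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} , \tag{2.71}$$

and using the abbreviation $\mathbf{J} = (\mathbf{J}_1, \mathbf{J}_2, \mathbf{J}_3)$, (2.70) takes the form

$$\mathbf{x}' = [\mathbf{1} - \varepsilon\, \hat{\boldsymbol{\varphi}} \cdot \mathbf{J}]\,\mathbf{x} + O(\varepsilon^2) . \tag{2.72}$$

Following Sect. 2.20 choose $\varepsilon = \varphi/n$ and apply the same rotation $n$ times. In the
limit $n \to \infty$ one obtains

$$\mathbf{x}' = \lim_{n \to \infty} \left( \mathbf{1} - \frac{\varphi}{n}\, \hat{\boldsymbol{\varphi}} \cdot \mathbf{J} \right)^n \mathbf{x} = \exp(-\boldsymbol{\varphi} \cdot \mathbf{J})\,\mathbf{x} . \tag{2.73}$$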

Thus, the finite rotation $\mathbf{R}(\boldsymbol{\varphi})$ is represented by an exponential series in the matrices
$\mathbf{J} = (\mathbf{J}_1, \mathbf{J}_2, \mathbf{J}_3)$ and the vector $\boldsymbol{\varphi}$. $\mathbf{J}_k$ is said to be the generator for infinitesimal
rotations about the axis $k$.

As before, the first equation of (2.73) can be visualized as $n$ successive rotations
by the angle $\varphi/n$. In the limit of $n$ going to infinity this becomes an infinite
product of infinitesimal rotations. By Gauss’ formula this is precisely the
exponential indicated in the second equation of (2.73). It is to be understood as the
well-known exponential series $\sum_{n=0}^{\infty} (1/n!)\,\mathbf{A}^n$ in the $3 \times 3$ matrix $\mathbf{A} \stackrel{\text{def}}{=} (-\boldsymbol{\varphi} \cdot \mathbf{J})$.

The matrices $\mathbf{R}(\boldsymbol{\varphi})$ form a compact Lie group (its parameter space is compact).
Its generators (2.71) obey the Lie algebra associated with this group. This means
that the commutator (or Lie product),

$$[\mathbf{J}_i, \mathbf{J}_k] \stackrel{\text{def}}{=} \mathbf{J}_i \mathbf{J}_k - \mathbf{J}_k \mathbf{J}_i ,$$

of any two of them is defined and belongs to the same set $\{\mathbf{J}_k\}$. Indeed, from (2.71)
one finds that

$$[\mathbf{J}_1, \mathbf{J}_2] = \mathbf{J}_3 , \quad [\mathbf{J}_1, \mathbf{J}_3] = -\mathbf{J}_2$$

together with four more relations that follow from these by cyclic permutation
of the indices. As the Lie product of any two elements of $\{\mathbf{J}_1, \mathbf{J}_2, \mathbf{J}_3\}$ is again
an element of this set (up to a sign), one says that the algebra of the $\mathbf{J}_k$ closes
under the Lie product.
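
The commutation relations can be verified by direct matrix multiplication. The following sketch uses the generators exactly as listed in (2.71); the helper names are illustrative and the check is purely arithmetic.

```python
J1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
J3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """Lie product [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def negated(a):
    return [[-x for x in row] for row in a]

print(commutator(J1, J2) == J3)
print(commutator(J2, J3) == J1)
print(commutator(J3, J1) == J2)
print(commutator(J1, J3) == negated(J2))
```

All four comparisons should evaluate to `True`; the second and third are the cyclic permutations of the first.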

Via (2.72) and (2.73) the generators yield a local representation of that part 
of the rotation group which contains the unit element, the identical mapping. This 
is not sufficient to reconstruct the global structure of this group (we do not reach 
its component containing matrices with determinant —1). It can happen, therefore, 
that two groups have the same Lie algebra but are different globally. This is indeed 
the case for SO(3), which has the same Lie algebra as SU(2), the group of complex 
2x2 matrices, which are unitary and have determinant +1 (the unitary unimodular 
group in two complex dimensions). The elements of the rotation group are differ- 
entiable in its parameters (the rotation angles). In this sense it is a differentiable 
manifold and one may ask questions such as: Is this manifold compact? (The ro- 
tation group is.) Is it simply connected? (The rotation group is doubly connected, 
see Sect. 5.2.3 (iv) and Exercise 3.11.)



2.23 Canonical Transformations



Of course, the choice of a set of generalized coordinates and of the corresponding 
generalized, canonically conjugate momenta is not unique. For example, Propo- 
sition 2.11 taught us that any diffeomorphic mapping of the original coordinates 
q onto new coordinates Q leaves invariant Lagrange’s formalism. The new set 
describes the same physics by means of a different parametrization. Such trans- 
formations are useful, however, whenever one succeeds in making some or all of 
the new coordinates cyclic. In this case the corresponding generalized momenta 
are constants of the motion. Following Sect. 2.12, we say that a coordinate $Q_k$ is
cyclic if L does not depend explicitly on it,

$$\frac{\partial L}{\partial Q_k} = 0 . \tag{2.74}$$

If this is the case, then also $\partial H/\partial Q_k = 0$, and

$$\dot{P}_k = -\frac{\partial H}{\partial Q_k} = 0 , \tag{2.75}$$

from which we conclude that $P_k = \alpha_k = \text{const}$. The canonical system described
by

$$H(Q_1, \dots, Q_{k-1}, Q_{k+1}, \dots, Q_f ;\ P_1, \dots, P_{k-1}, \alpha_k, P_{k+1}, \dots, P_f ,\ t)$$

is reduced to a system with $f - 1$ degrees of freedom. For instance, if all $Q_k$ are
cyclic, i.e. if

$$H = H(P_1, \dots, P_f ;\ t) ,$$

the solution of the canonical equations is elementary, because

$$\dot{P}_i = 0 \ \to\ P_i = \alpha_i = \text{const} , \quad i = 1, \dots, f , \quad\text{and}$$

$$\dot{Q}_i = \frac{\partial H}{\partial P_i} \Big|_{P_i = \alpha_i} \stackrel{\text{def}}{=} v_i(t) ,$$

from which the solutions are obtained by integration, viz.

$$Q_i = \int_{t_0}^{t} v_i(t')\, dt' + \beta_i , \quad i = 1, \dots, f .$$

The $2f$ parameters $\{\alpha_i, \beta_i\}$ are constants of integration.

This raises a general question: Is it possible to transform the coordinates and
momenta in such a way that the canonical structure of the equations of motion is
preserved and that some or all coordinates become cyclic? This question leads to
the definition of canonical transformations.

Diffeomorphic transformations of the variables q and p and of the Hamiltonian
function $H(q, p, t)$, generated by a smooth function of old and new variables in
the way described below

$$\{q, p\} \to \{Q, P\} , \quad H(q, p, t) \to \tilde{H}(Q, P, t) , \tag{2.76}$$

are said to be canonical if they preserve the structure of the canonical equations
(2.45)⁵. Thus, with (2.45) we shall also find that

$$\dot{Q}_i = \frac{\partial \tilde{H}}{\partial P_i} , \quad \dot{P}_i = -\frac{\partial \tilde{H}}{\partial Q_i} . \tag{2.77}$$



In order to satisfy this requirement the variational principle (2.53) of Sect. 2.17
must hold for the system $\{q, p, H\}$ as well as for the system $\{Q, P, \tilde{H}\}$, viz.

$$\delta \int_{t_1}^{t_2} \left[ \sum_{i=1}^{f} p_i \dot{q}_i - H(q, p, t) \right] dt = 0 , \tag{2.78}$$

$$\delta \int_{t_1}^{t_2} \left[ \sum_{i=1}^{f} P_i \dot{Q}_i - \tilde{H}(Q, P, t) \right] dt = 0 . \tag{2.79}$$

Proposition 2.10 tells us that this is certainly true if the integrands in (2.78) and
(2.79) do not differ by more than the total time derivative of a function M:

$$\sum_{i=1}^{f} p_i \dot{q}_i - H(q, p, t) = \sum_{j=1}^{f} P_j \dot{Q}_j - \tilde{H}(Q, P, t) + \frac{d}{dt} M , \tag{2.80}$$

where M depends on old and new variables (but not on their time derivatives) and,
possibly, time. There are four ways of choosing M, corresponding to the possible
choices of old coordinates/momenta and new coordinates/momenta. These four
classes can be obtained from one another by Legendre transformation. They are
as follows.

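(A) The choice

$$M(q, Q, t) \equiv \Phi(q, Q, t) . \tag{2.81}$$

In this case we obtain

$$\frac{dM}{dt} \equiv \frac{d\Phi(q, Q, t)}{dt} = \frac{\partial \Phi}{\partial t} + \sum_{j=1}^{f} \left[ \frac{\partial \Phi}{\partial q_j} \dot{q}_j + \frac{\partial \Phi}{\partial Q_j} \dot{Q}_j \right] . \tag{2.82}$$

As q and Q are independent variables, (2.80) is fulfilled if and only if the following
equations hold true:

$$p_i = \frac{\partial \Phi}{\partial q_i} , \quad P_j = -\frac{\partial \Phi}{\partial Q_j} , \quad \tilde{H} = H + \frac{\partial \Phi}{\partial t} . \tag{2.83}$$

The function $\Phi$ (and likewise any other function M) is said to be the generating
function of the canonical transformation. The first equation of (2.83) can be solved
for $Q_k(q, p, t)$ if

$$\det\left( \frac{\partial^2 \Phi}{\partial q_i\, \partial Q_j} \right) \neq 0 , \tag{2.84a}$$

and the second can be solved for $Q_k(q, P, t)$ if

$$\det\left( \frac{\partial^2 \Phi}{\partial Q_i\, \partial Q_j} \right) \neq 0 . \tag{2.84b}$$

5 See the precise definition in Sect. 5.5.4 below.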



(B) The choice

$$M(q, P, t) = S(q, P, t) - \sum_{k=1}^{f} Q_k(q, P, t) P_k . \tag{2.85}$$

This is obtained by taking the Legendre transform of the generating function (2.81)
with respect to Q:

$$(\mathcal{L}\Phi)(Q) = \sum_k Q_k \frac{\partial \Phi}{\partial Q_k} - \Phi(q, Q, t) = -\left[ \sum_k Q_k P_k + \Phi \right] .$$

We then have

$$S(q, P, t) = \sum_{k=1}^{f} Q_k(q, P, t) P_k + \Phi(q, Q(q, P, t), t) . \tag{2.86}$$

With the condition (2.84b) $Q_k$ can be expressed in terms of q and P. From (2.83) and (2.86)
we conclude that

$$p_i = \frac{\partial S}{\partial q_i} , \quad Q_k = \frac{\partial S}{\partial P_k} , \quad \tilde{H} = H + \frac{\partial S}{\partial t} . \tag{2.87}$$

The same equations are obtained if the generating function (2.85) is inserted into
(2.80), taking into account that q and P are independent.

(C) The choice

$$M(Q, p, t) = U(Q, p, t) + \sum_{k=1}^{f} q_k(Q, p, t)\, p_k . \tag{2.88}$$

For this we take the Legendre transform of $\Phi$ with respect to q:

$$(\mathcal{L}\Phi)(q) = \sum_i q_i \frac{\partial \Phi}{\partial q_i} - \Phi(q, Q, t) = \sum_i q_i p_i - \Phi(q, Q, t) .$$

We then have

$$U(Q, p, t) = -\sum_{k=1}^{f} q_k(Q, p, t)\, p_k + \Phi(q(Q, p, t), Q, t) \tag{2.89}$$

and obtain the equations

$$q_k = -\frac{\partial U}{\partial p_k} , \quad P_k = -\frac{\partial U}{\partial Q_k} , \quad \tilde{H} = H + \frac{\partial U}{\partial t} . \tag{2.90}$$

(D) The choice

$$M(P, p, t) = V(P, p, t) - \sum_{k=1}^{f} Q_k(q(P, p, t), P, t) P_k + \sum_{k=1}^{f} q_k(P, p, t)\, p_k .$$

This fourth possibility is obtained from S, for instance, by taking its Legendre
transform with respect to q:

$$(\mathcal{L}S)(q) = \sum_i q_i \frac{\partial S}{\partial q_i} - S(q, P, t) = \sum_i q_i p_i - S ,$$

so that

$$V(P, p, t) = -\sum_{k=1}^{f} q_k(P, p, t)\, p_k + S(q(P, p, t), P, t) .$$

In this case one obtains the equations

$$q_k = -\frac{\partial V}{\partial p_k} , \quad Q_k = \frac{\partial V}{\partial P_k} , \quad \tilde{H} = H + \frac{\partial V}{\partial t} . \tag{2.91}$$



This classification of generating functions for canonical transformations may at 
first seem rather complicated. When written in this form, the general structure of 
canonical transformations is not transparent. In reality, the four types (A-D) are 
closely related and can be treated in a unified way. This is easy to understand if one 
realizes that generalized coordinates are in no way distinguished over generalized 
momenta and, in particular, that coordinates can be transformed into momenta and 
vice versa. In Sects. 2.25 and 2.27 below we shall introduce a unified formulation 
that clarifies these matters. Before doing this we consider two examples.



2.24 Examples of Canonical Transformations

Example (i) Class B is distinguished from the others by the fact that it contains
the identical mapping. In order to see this let

$$S(q, P) = \sum_{i=1}^{f} q_i P_i . \tag{2.92}$$

We confirm, indeed, from (2.87) that

$$p_i = \frac{\partial S}{\partial q_i} = P_i , \quad Q_j = \frac{\partial S}{\partial P_j} = q_j , \quad \tilde{H} = H .$$

Class A, on the other hand, contains that transformation which interchanges the
role of coordinates and momenta. Indeed, taking

$$\Phi(q, Q) = \sum_{k=1}^{f} q_k Q_k \tag{2.93}$$

we find that (2.83) gives $p_i = Q_i$, $P_k = -q_k$, $\tilde{H}(Q, P) = H(-P, Q)$.



Example (ii) For the harmonic oscillator there is a simple canonical
transformation that makes Q cyclic. Start from

$$H(q, p) = \frac{p^2}{2m} + \frac{1}{2} m \omega^2 q^2 \quad (f = 1)$$

and apply the canonical transformation generated by

$$\Phi(q, Q) = \frac{1}{2} m \omega q^2 \cot Q . \tag{2.94}$$

In this case (2.83) are

$$p = \frac{\partial \Phi}{\partial q} = m \omega q \cot Q , \quad P = -\frac{\partial \Phi}{\partial Q} = \frac{1}{2} \frac{m \omega q^2}{\sin^2 Q} ,$$

or, by solving for q and p,

$$q = \sqrt{\frac{2P}{m\omega}} \sin Q , \quad p = \sqrt{2 m \omega P} \cos Q , \tag{2.95}$$

and, finally, $\tilde{H} = \omega P$. Thus, Q is cyclic, and we have

$$\dot{P} = -\frac{\partial \tilde{H}}{\partial Q} = 0 \ \to\ P = \alpha = \text{const} , \quad \dot{Q} = \frac{\partial \tilde{H}}{\partial P} = \omega \ \to\ Q = \omega t + \beta . \tag{2.96}$$

When translated back to the original coordinates this gives the familiar solution

$$q(t) = \sqrt{\frac{2\alpha}{m\omega}} \sin(\omega t + \beta) .$$



As expected, the general solution depends on two integration constants whose
interpretation is obvious: $\alpha$ determines the amplitude (it is assumed to be positive)
and $\beta$ the phase of the oscillation.

Whenever the new momentum P is a constant and the new coordinate Q a 
linear function of time, P is said to be an action variable, Q an angle variable. 
We return to action-angle variables below. 
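
The relations of Example (ii) can be confirmed numerically at an arbitrary phase-space point. The sketch below assumes the standard form of the transformation, $q = \sqrt{2P/(m\omega)}\,\sin Q$, $p = \sqrt{2m\omega P}\,\cos Q$; the function names and the numerical values of $m$, $\omega$, $Q$, $P$ are illustrative choices, not taken from the text.

```python
import math

def old_variables(Q, P, m, omega):
    """(q, p) as functions of the new variables (Q, P)."""
    q = math.sqrt(2.0 * P / (m * omega)) * math.sin(Q)
    p = math.sqrt(2.0 * m * omega * P) * math.cos(Q)
    return q, p

def hamiltonian(q, p, m, omega):
    """Harmonic oscillator: H = p^2/(2m) + m omega^2 q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * m * omega ** 2 * q * q

m, omega = 1.5, 2.0     # illustrative values
Q, P = 0.7, 0.9         # arbitrary point with P > 0 and sin Q != 0
q, p = old_variables(Q, P, m, omega)

print(hamiltonian(q, p, m, omega), omega * P)           # new Hamiltonian: omega * P
print(p, m * omega * q / math.tan(Q))                   # p = dPhi/dq
print(P, 0.5 * m * omega * q * q / math.sin(Q) ** 2)    # P = -dPhi/dQ
```

Each printed pair should agree to rounding accuracy, which illustrates that the transformed Hamiltonian depends on $P$ only.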



2.25 The Structure of the Canonical Equations 



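First, we consider a system with $f = 1$. We assume that it is described by a
Hamiltonian function $H(q, p, t)$. As in Sect. 1.16 we set

$$x \stackrel{\text{def}}{=} \begin{pmatrix} q \\ p \end{pmatrix} , \quad\text{or}\quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad\text{with}\quad x_1 \equiv q ,\ x_2 \equiv p , \tag{2.97a}$$

as well as⁶

$$H_{,x} \stackrel{\text{def}}{=} \begin{pmatrix} \partial H/\partial x_1 \\ \partial H/\partial x_2 \end{pmatrix} \equiv \begin{pmatrix} \partial H/\partial q \\ \partial H/\partial p \end{pmatrix} \quad\text{and}\quad \mathbf{J} \stackrel{\text{def}}{=} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{2.97b}$$

The canonical equations then take the form

$$-\mathbf{J}\dot{x} = H_{,x} \tag{2.98}$$

or

$$\dot{x} = \mathbf{J} H_{,x} . \tag{2.99}$$

The second equation follows from the observation that $\mathbf{J}^{-1} = -\mathbf{J}$. Indeed,

$$\mathbf{J}^2 = -\mathbf{1} \quad\text{and}\quad \mathbf{J}^T = \mathbf{J}^{-1} = -\mathbf{J} . \tag{2.100}$$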

6 The derivative of H by $x_k$ is written as $H_{,x_k}$. More generally, the set of all derivatives of H
by $x$ is abbreviated by $H_{,x}$.

The solutions of (2.99) have the form

$$x(t, s, y) = \Phi_{t,s}(y) \quad\text{with}\quad \Phi_{s,s}(y) = y , \tag{2.101}$$

where $s$ and $y$ are the initial time and initial configuration respectively.

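For an arbitrary number $f$ of degrees of freedom we have in a similar way
(see also Sect. 1.18)

$$x \stackrel{\text{def}}{=} \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_f \\ p_1 \\ p_2 \\ \vdots \\ p_f \end{pmatrix} , \quad H_{,x} \stackrel{\text{def}}{=} \begin{pmatrix} \partial H/\partial q_1 \\ \vdots \\ \partial H/\partial q_f \\ \partial H/\partial p_1 \\ \vdots \\ \partial H/\partial p_f \end{pmatrix} , \quad \mathbf{J} \stackrel{\text{def}}{=} \begin{pmatrix} 0_{f \times f} & \mathbf{1}_{f \times f} \\ -\mathbf{1}_{f \times f} & 0_{f \times f} \end{pmatrix} . \tag{2.102}$$

The canonical equations have again the form (2.98) or (2.99) with

$$\mathbf{J} = \begin{pmatrix} 0 & \mathbf{1} \\ -\mathbf{1} & 0 \end{pmatrix} , \tag{2.103}$$

where $\mathbf{1}$ denotes the $f \times f$ unit matrix. Clearly, $\mathbf{J}$ has the same properties (2.100)
as for $f = 1$.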



2.26 Example: Linear Autonomous Systems 
in One Dimension 



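Before proceeding further we consider a simple example: the class of linear,
autonomous systems with one degree of freedom. Linear means that $\dot{x} = \mathbf{A}x$, where
$\mathbf{A}$ is a $2 \times 2$ matrix. Equation (2.98) now reads

$$-\mathbf{J}\dot{x} = -\mathbf{J}\mathbf{A}x = H_{,x} . \tag{2.104}$$

This means in turn that H must have the general form

$$H = \frac{1}{2}\,[a q^2 + 2 b q p + c p^2] \equiv \frac{1}{2}\,[a x_1^2 + 2 b x_1 x_2 + c x_2^2] . \tag{2.105}$$

Thus

$$\dot{x} = \mathbf{A}x = \mathbf{J} H_{,x} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} \partial H/\partial x_1 \\ \partial H/\partial x_2 \end{pmatrix} = \begin{pmatrix} b x_1 + c x_2 \\ -a x_1 - b x_2 \end{pmatrix} \tag{2.105'}$$

and the matrix $\mathbf{A}$ is given by

$$\mathbf{A} = \begin{pmatrix} b & c \\ -a & -b \end{pmatrix} .$$

Note that its trace vanishes, $\mathrm{Tr}\,\mathbf{A} = 0$. It is not difficult to solve (2.105') directly
in matrix form, viz.

$$x = \exp[(t - s)\mathbf{A}]\, y \equiv \Phi_{t-s}(y) . \tag{2.106}$$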

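The exponential is calculated by its series expansion. The square of $\mathbf{A}$ is
proportional to the unit matrix,

$$\mathbf{A}^2 = \begin{pmatrix} b & c \\ -a & -b \end{pmatrix} \begin{pmatrix} b & c \\ -a & -b \end{pmatrix} = (b^2 - ac)\,\mathbf{1} \equiv -\Delta\,\mathbf{1} .$$

Therefore, all even powers of $\mathbf{A}$ are multiples of the unit matrix; all odd powers
are multiples of $\mathbf{A}$:

$$\mathbf{A}^{2n} = (-1)^n \Delta^n\, \mathbf{1} , \quad \mathbf{A}^{2n+1} = (-1)^n \Delta^n\, \mathbf{A} .$$

$\Delta \stackrel{\text{def}}{=} ac - b^2$ is the determinant of $\mathbf{A}$. For the sake of illustration we assume that
$\Delta$ is positive. We then have (see also Sect. 2.20)

$$\exp\{(t - s)\mathbf{A}\} = \mathbf{1} \cos\big(\sqrt{\Delta}\,(t - s)\big) + \mathbf{A}\, \frac{1}{\sqrt{\Delta}} \sin\big(\sqrt{\Delta}\,(t - s)\big) . \tag{2.107}$$

Thus, the solution (2.106) is obtained as follows, setting $\omega \stackrel{\text{def}}{=} \sqrt{\Delta} = \sqrt{ac - b^2}$:

$$x = \Phi_{t-s}(y) \equiv \mathbf{P}(t - s)\, y = \begin{pmatrix} \cos\omega(t - s) + \dfrac{b}{\omega} \sin\omega(t - s) & \dfrac{c}{\omega} \sin\omega(t - s) \\[2mm] -\dfrac{a}{\omega} \sin\omega(t - s) & \cos\omega(t - s) - \dfrac{b}{\omega} \sin\omega(t - s) \end{pmatrix} y .$$

It describes harmonic oscillations. (If, instead, we choose $\Delta < 0$, it describes
exponential behavior.) The solution is a linear function of the initial configuration,
$x^i = \sum_{k=1}^{2} P_{ik}(t - s)\, y^k$, from which we obtain $dx^i = \sum_{k=1}^{2} P_{ik}\, dy^k$. The volume
element $dx^1 dx^2$ in phase space is invariant if $\det(\partial x^i/\partial y^k) = \det(P_{ik}) = 1$. This
is indeed the case:

$$\det(P_{ik}) = \cos^2\omega(t - s) - \frac{b^2 - ac}{\omega^2} \sin^2\omega(t - s) = 1 .$$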


(Recall the remark at the end of Sect. 1.21.1.) This “conservation of phase volume”
is sketched in Fig. 2.11 for the case $a = c = 1$, $b = 0$, i.e. the harmonic oscillator.
In fact, this is nothing but the content of Liouville’s theorem to which we return
below, in a more general context (Sects. 2.29 and 2.30).

Fig. 2.11. Phase portraits for the harmonic oscillator
(units as in Sect. 1.17.1). The hatched area wanders
about the origin with constant angular velocity and
without changing its shape
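
The properties of the flow matrix of Sect. 2.26 can be illustrated numerically. The sketch below assumes the oscillatory case $\Delta = ac - b^2 > 0$ and writes the matrix as $\mathbf{P}(\tau) = \mathbf{1}\cos\omega\tau + \mathbf{A}\,\omega^{-1}\sin\omega\tau$ with $\mathbf{A}$ built from the coefficients $a$, $b$, $c$ of the quadratic Hamiltonian; the function names and coefficient values are illustrative only. It checks that $\det \mathbf{P} = 1$ and that $\mathbf{P}(\tau_1)\mathbf{P}(\tau_2) = \mathbf{P}(\tau_1 + \tau_2)$.

```python
import math

def propagator(a, b, c, tau):
    """P(tau) = 1 cos(w tau) + A sin(w tau)/w for A = ((b, c), (-a, -b)), Delta > 0."""
    delta = a * c - b * b
    if delta <= 0:
        raise ValueError("this sketch covers only the oscillatory case Delta > 0")
    w = math.sqrt(delta)
    cs, sn = math.cos(w * tau), math.sin(w * tau)
    return [[cs + b / w * sn, c / w * sn],
            [-a / w * sn, cs - b / w * sn]]

def det2(p):
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]

def matmul2(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b, c = 2.0, 0.5, 1.0    # illustrative coefficients, Delta = 1.75
p1 = propagator(a, b, c, 0.3)
p2 = propagator(a, b, c, 0.4)
p12 = propagator(a, b, c, 0.7)

print(det2(p12))                 # phase volume is conserved
print(matmul2(p1, p2), p12)      # flow property: times add up
```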



2.27 Canonical Transformations in Compact Notation 



The $2f$-dimensional phase space of a Hamiltonian system carries an interesting
geometrical structure which is encoded in the canonical equations

$$\begin{pmatrix} \dot{q} \\ \dot{p} \end{pmatrix} = \begin{pmatrix} \partial H/\partial p \\ -\partial H/\partial q \end{pmatrix}$$

and in the canonical transformations that leave these equations form invariant. This
structure becomes apparent, for the first time, if the canonical transformations (A-D)
and the conditions on their derivatives are formulated in the compact notation
of Sect. 2.25.

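In Sect. 2.23 (2.84a) we saw that the condition

$$\det\left( \frac{\partial^2 \Phi}{\partial q_i\, \partial Q_k} \right) \neq 0 \tag{2.108a}$$

had to be imposed on canonical transformations of class A. Only with this condition
could the equation $p_i = \partial\Phi/\partial q_i$ be solved for $Q_k(q, p, t)$. Similarly, in the other
three cases we had the requirements

$$\det\left( \frac{\partial^2 S}{\partial q_i\, \partial P_k} \right) \neq 0 , \quad \det\left( \frac{\partial^2 U}{\partial Q_k\, \partial p_i} \right) \neq 0 , \quad \det\left( \frac{\partial^2 V}{\partial P_k\, \partial p_i} \right) \neq 0 . \tag{2.108b}$$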



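From (2.83) one reads off the conditions

$$\frac{\partial p_i}{\partial Q_k} = \frac{\partial^2 \Phi}{\partial Q_k\, \partial q_i} = -\frac{\partial P_k}{\partial q_i} . \tag{2.109a}$$

Similarly, from (2.87),

$$\frac{\partial p_i}{\partial P_k} = \frac{\partial^2 S}{\partial P_k\, \partial q_i} = \frac{\partial Q_k}{\partial q_i} . \tag{2.109b}$$

From (2.90) of class C,

$$\frac{\partial q_i}{\partial Q_k} = -\frac{\partial^2 U}{\partial Q_k\, \partial p_i} = \frac{\partial P_k}{\partial p_i} \tag{2.109c}$$

and from (2.91) of class D,

$$\frac{\partial q_i}{\partial P_k} = -\frac{\partial^2 V}{\partial P_k\, \partial p_i} = -\frac{\partial Q_k}{\partial p_i} . \tag{2.109d}$$

Returning to the compact notation of Sect. 2.25 we have

$$x \stackrel{\text{def}}{=} \{q_1 \dots q_f ;\ p_1 \dots p_f\} \quad\text{and}\quad y \stackrel{\text{def}}{=} \{Q_1 \dots Q_f ;\ P_1 \dots P_f\} . \tag{2.110}$$

Equations (2.109a-d) all contain derivatives of the form

$$\frac{\partial x_\alpha}{\partial y_\beta} \stackrel{\text{def}}{=} M_{\alpha\beta} , \quad \frac{\partial y_\alpha}{\partial x_\beta} = (\mathbf{M}^{-1})_{\alpha\beta} , \quad \alpha, \beta = 1, \dots, 2f . \tag{2.111}$$

Clearly, one has

$$\sum_{\gamma=1}^{2f} \frac{\partial x_\alpha}{\partial y_\gamma} \frac{\partial y_\gamma}{\partial x_\beta} = \sum_{\gamma=1}^{2f} M_{\alpha\gamma} (\mathbf{M}^{-1})_{\gamma\beta} = \delta_{\alpha\beta} .$$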

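We now show that the diversity of (2.109) can be summarized as follows:

$$M_{\alpha\beta} = \sum_{\mu=1}^{2f} \sum_{\nu=1}^{2f} J_{\alpha\mu} J_{\beta\nu} (\mathbf{M}^{-1})_{\nu\mu} . \tag{2.112}$$

Taking account of the relation $\mathbf{J}^{-1} = -\mathbf{J}$ this equation is written alternatively as

$$-\mathbf{J}\mathbf{M} = (\mathbf{J}\mathbf{M}^{-1})^T .$$

We prove it by calculating its two sides separately. The left-hand side is

$$-\mathbf{J}\mathbf{M} = -\begin{pmatrix} 0 & \mathbf{1} \\ -\mathbf{1} & 0 \end{pmatrix} \begin{pmatrix} \partial q/\partial Q & \partial q/\partial P \\ \partial p/\partial Q & \partial p/\partial P \end{pmatrix} = \begin{pmatrix} -\partial p/\partial Q & -\partial p/\partial P \\ \partial q/\partial Q & \partial q/\partial P \end{pmatrix}$$

while the right-hand side is

$$(\mathbf{J}\mathbf{M}^{-1})^T = \begin{pmatrix} \partial P/\partial q & -\partial Q/\partial q \\ \partial P/\partial p & -\partial Q/\partial p \end{pmatrix} .$$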

Equations (2.109) tell us that these two matrices are in fact equal. Thus, (2.112)
is proved. This equation is rewritten as follows. Writing out the transposition on
the right-hand side, we obtain $\mathbf{J}\mathbf{M} = -(\mathbf{M}^{-1})^T \mathbf{J}^T = +(\mathbf{M}^{-1})^T \mathbf{J}$. Multiplying this
equation with $\mathbf{M}^T$ from the left, we see that $\mathbf{M}$ obeys the equation

$$\mathbf{M}^T \mathbf{J} \mathbf{M} = \mathbf{J} \tag{2.113}$$

no matter which type of canonical transformation is being studied.

What is the significance of this equation? According to (2.111) and (2.109a- 
d), M is the matrix of second derivatives of generating functions for canonical 
transformations. The matrices M obeying (2.113) form a group that imprints a 
characteristic symmetry on phase space. The matrix J, on the other hand, turns 
out to be invariant under canonical transformations. For this reason, it plays the 
role of a metric in phase space. These statements are proved and analyzed in the 
next section. 
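
As an illustration of (2.113), one can take the harmonic-oscillator transformation of Sect. 2.24, Example (ii), form the Jacobian $M_{\alpha\beta} = \partial x_\alpha/\partial y_\beta$ by hand, and check the symplectic condition numerically. The following is a minimal sketch: it assumes $q = \sqrt{2P/(m\omega)}\,\sin Q$ and $p = \sqrt{2m\omega P}\,\cos Q$, and the function names and sample values are illustrative.

```python
import math

J = [[0.0, 1.0], [-1.0, 0.0]]

def jacobian(Q, P, m, omega):
    """M = d(q, p)/d(Q, P) for q = sqrt(2P/(m w)) sin Q, p = sqrt(2 m w P) cos Q."""
    return [[math.sqrt(2.0 * P / (m * omega)) * math.cos(Q),      # dq/dQ
             math.sin(Q) / math.sqrt(2.0 * m * omega * P)],       # dq/dP
            [-math.sqrt(2.0 * m * omega * P) * math.sin(Q),       # dp/dQ
             math.sqrt(m * omega / (2.0 * P)) * math.cos(Q)]]     # dp/dP

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def symplectic_defect(mat):
    """Largest entry of M^T J M - J in absolute value."""
    mjm = matmul(matmul(transpose(mat), J), mat)
    return max(abs(mjm[i][k] - J[i][k]) for i in range(2) for k in range(2))

M = jacobian(0.7, 0.9, 1.5, 2.0)   # illustrative point and parameters
print(symplectic_defect(M))
```

The printed defect should vanish up to rounding errors, whatever admissible point $(Q, P)$ is chosen.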



2.28 On the Symplectic Structure of Phase Space 

The set of all matrices that obey (2.113) form a group, the real symplectic group
$Sp_{2f}(\mathbb{R})$ over the space $\mathbb{R}^{2f}$. This is a group that is defined over a space with
even dimension and that is characterized by a skew-symmetric, invariant bilinear
form. As a first step we shall verify that the $\mathbf{M}$ indeed form a group G.

(1) There exists an operation that defines the composition of any two elements
$\mathbf{M}_1$ and $\mathbf{M}_2$, $\mathbf{M}_3 = \mathbf{M}_1 \mathbf{M}_2$. Obviously, this is matrix multiplication here. $\mathbf{M}_3$ is
again an element of G, as one verifies by direct calculation:

$$\mathbf{M}_3^T \mathbf{J} \mathbf{M}_3 = (\mathbf{M}_1 \mathbf{M}_2)^T \mathbf{J} (\mathbf{M}_1 \mathbf{M}_2) = \mathbf{M}_2^T (\mathbf{M}_1^T \mathbf{J} \mathbf{M}_1) \mathbf{M}_2 = \mathbf{J} .$$

(2) This operation is associative because matrix multiplication has this property.

(3) There is a unit element in G: $\mathbf{E} = \mathbf{1}$. Indeed, $\mathbf{1}^T \mathbf{J} \mathbf{1} = \mathbf{J}$.

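(4) For every $\mathbf{M} \in G$ there is an inverse given by

$$\mathbf{M}^{-1} = \mathbf{J}^{-1} \mathbf{M}^T \mathbf{J} .$$

This is verified as follows:

(a) Equation (2.113) implies that $(\det \mathbf{M})^2 = 1$, i.e. $\mathbf{M}$ is not singular and has an
inverse.

(b) $\mathbf{J}$ also belongs to G since $\mathbf{J}^T \mathbf{J} \mathbf{J} = \mathbf{J}^{-1} \mathbf{J} \mathbf{J} = \mathbf{J}$.

(c) One now confirms that $\mathbf{M}^{-1} \mathbf{M} = \mathbf{1}$:

$$\mathbf{M}^{-1} \mathbf{M} = (\mathbf{J}^{-1} \mathbf{M}^T \mathbf{J}) \mathbf{M} = \mathbf{J}^{-1} (\mathbf{M}^T \mathbf{J} \mathbf{M}) = \mathbf{J}^{-1} \mathbf{J} = \mathbf{1}$$

and, finally, that $\mathbf{M}^T$ also belongs to G:

$$(\mathbf{M}^T)^T \mathbf{J} \mathbf{M}^T = (\mathbf{M}\mathbf{J}) \mathbf{M}^T = (\mathbf{M}\mathbf{J})(\mathbf{J} \mathbf{M}^{-1} \mathbf{J}^{-1}) = (\mathbf{M}\mathbf{J})(\mathbf{J}^{-1} \mathbf{M}^{-1} \mathbf{J}) = \mathbf{J} .$$

(In the second step we have taken $\mathbf{M}^T$ from (2.113); in the third step we have
used $\mathbf{J}^{-1} = -\mathbf{J}$ twice.)

Thus, we have proved that the matrices $\mathbf{M}$ that fulfill (2.113) form a group.
The underlying space is the phase space $\mathbb{R}^{2f}$.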

There is a skew-symmetric bilinear form on this space that is invariant under
transformations pertaining to $G = Sp_{2f}$ and that can be understood as a generalized
scalar product of vectors over $\mathbb{R}^{2f}$. For two arbitrary vectors $x$ and $y$ we define

$$[x, y] \stackrel{\text{def}}{=} x^T \mathbf{J} y = \sum_{i,k=1}^{2f} x_i J_{ik} y_k . \tag{2.114}$$

One can easily verify that this form is invariant. Let $\mathbf{M} \in Sp_{2f}$ and let $x' = \mathbf{M}x$,
$y' = \mathbf{M}y$. Then

$$[x', y'] = [\mathbf{M}x, \mathbf{M}y] = x^T \mathbf{M}^T \mathbf{J} \mathbf{M} y = x^T \mathbf{J} y = [x, y] .$$

The bilinear form has the following properties, which are read off (2.114).

(i) It is skew-symmetric:

$$[y, x] = -[x, y] . \tag{2.115a}$$

Proof.

$$[y, x] = (x^T \mathbf{J}^T y)^T = -(x^T \mathbf{J} y)^T = -x^T \mathbf{J} y = -[x, y] . \tag{2.115b}$$

□

(ii) It is linear in both arguments. For instance,

$$[x, \lambda_1 y_1 + \lambda_2 y_2] = \lambda_1 [x, y_1] + \lambda_2 [x, y_2] . \tag{2.115c}$$

If $[x, y] = 0$ for all $y \in \mathbb{R}^{2f}$, then $x \equiv 0$. This means that the form (2.114) is not
degenerate. Thus, it has all properties that one expects for a scalar product.

The symplectic group $Sp_{2f}$ is the symmetry group of $\mathbb{R}^{2f}$, together with the
structure $[x, y]$ (2.114), in the same way as $O(2f)$ is the symmetry group of the
same space with the structure of the ordinary scalar product $(x, y) = \sum_{k=1}^{2f} x_k y_k$.
Note, however, that the symplectic structure (as a nondegenerate form) is only
defined for even dimension $n = 2f$, while the Euclidean structure $(x, y)$ is defined
for both even and odd dimension and is nondegenerate in either case.

Consider now $2f$ vectors over $\mathbb{R}^{2f}$, $x^{(1)}, \dots, x^{(2f)}$, that are assumed to be
linearly independent. Then take the oriented volume of the parallelepiped spanned
by these vectors:

$$[x^{(1)}, x^{(2)}, \dots, x^{(2f)}] \stackrel{\text{def}}{=} \det \begin{pmatrix} x_1^{(1)} & \cdots & x_1^{(2f)} \\ \vdots & & \vdots \\ x_{2f}^{(1)} & \cdots & x_{2f}^{(2f)} \end{pmatrix} . \tag{2.116}$$

Lemma. If $\pi(1), \pi(2), \dots, \pi(2f)$ denotes the permutation $\pi$ of the indices
$1, 2, \dots, 2f$ and $\sigma(\pi)$ its signature (i.e. $\sigma = +1$ if it is even, $\sigma = -1$ if it is
odd), then

$$[x^{(1)}, \dots, x^{(2f)}] = \frac{(-1)^{[f/2]}}{f!\, 2^f} \sum_{\pi} \sigma(\pi)\, [x^{(\pi(1))}, x^{(\pi(2))}]\, [x^{(\pi(3))}, x^{(\pi(4))}] \cdots [x^{(\pi(2f-1))}, x^{(\pi(2f))}] . \tag{2.117}$$



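Proof. When written out explicitly, the right-hand side reads

$$\frac{(-1)^{[f/2]}}{f!\, 2^f} \sum_{n_1 \dots n_{2f}} J_{n_1 n_2} J_{n_3 n_4} \cdots J_{n_{2f-1} n_{2f}} \left( \sum_{\pi} \sigma(\pi)\, x_{n_1}^{(\pi(1))} x_{n_2}^{(\pi(2))} \cdots x_{n_{2f}}^{(\pi(2f))} \right) .$$

The second factor of this expression is precisely the determinant (2.116) if
$\{n_1, \dots, n_{2f}\}$ is an even permutation of $\{1, 2, \dots, 2f\}$; it is minus that
determinant if it is an odd permutation. Denoting this permutation by $\pi'$ and its signature
by $\sigma(\pi')$, this last expression is equal to

$$[x^{(1)}, \dots, x^{(2f)}] \left( \frac{(-1)^{[f/2]}}{f!\, 2^f} \sum_{\pi'} \sigma(\pi')\, J_{\pi'(1)\pi'(2)} \cdots J_{\pi'(2f-1)\pi'(2f)} \right) .$$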

We now show that the factor in brackets equals 1, thus proving the lemma. This
goes as follows. We know that $J_{i,i+f} = +1$, $J_{j+f,j} = -1$, while all other elements
vanish. In calculating $\sigma(\pi) J_{\pi(1)\pi(2)} \cdots J_{\pi(2f-1)\pi(2f)}$ we have the following
possibilities.

(a) $J_{i_1,i_1+f} J_{i_2,i_2+f} \cdots J_{i_f,i_f+f}$ with $1 \le i_k \le f$, all $i_k$ being different from each
other. There are $f!$ such products and they all have the value +1 because all
of them are obtained from $J_{1,f+1} \cdots J_{f,2f}$ by exchanging the indices pairwise.
The signature is the same for all of them; call it $\sigma(a)$.

(b) Exchange now one pair of indices, i.e. $J_{i_1,i_1+f} \cdots J_{i_k+f,i_k} \cdots J_{i_f,i_f+f}$. There
are $f \times f!$ products of this type and they all have the value −1. They all have
the same signature $\sigma(\pi) \equiv \sigma(b)$ and $\sigma(b) = -\sigma(a)$.

(c) Exchange two pairs of indices to obtain
$J_{i_1,i_1+f} \cdots J_{i_k+f,i_k} \cdots J_{i_l+f,i_l} \cdots J_{i_f,i_f+f}$. There are $[f(f-1)/2] \times f!$ products of this
class and their value is again +1. Their signature is $\sigma(c) = \sigma(a)$; etc.

Thus, with the signature factor included, all terms contribute with the same
sign $\sigma(a)$, and their number is

$$ f!\,[1 + f + f(f-1)/2 + \ldots + 1] = f!\,2^f . $$

It remains to determine $\sigma(a)$. In $J_{1,f+1} J_{2,f+2} \cdots J_{f,2f}$ (which is +1) the order
of the indices $(1, f+1, 2, f+2, \ldots, f, 2f)$ is obtained from $(1, 2, \ldots, f, f+1, \ldots, 2f)$
by $(f-1) + (f-2) + \ldots + 1 = f(f-1)/2$ exchanges of neighbors.
Thus $\sigma(a) = (-1)^{f(f-1)/2}$. As one easily convinces oneself, this is the same as
$(-1)^{[f/2]}$. □
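
The lemma can be spot-checked numerically. The following is a minimal sketch, not part of the original text: it assumes the block form $J = ((0, 1), (-1, 0))$ for the symplectic form (2.114), and the helper names (`sign`, `bracket`, `volume`, `lemma_rhs`, `max_mismatch`) are illustrative choices. It compares the determinant (2.116) with the right-hand side of (2.117) for random vectors by brute-force summation over all permutations, which is only practical for small $f$.

```python
import itertools
import math
import random


def sign(perm):
    """Signature of a permutation given as a tuple of 0-based indices."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def bracket(x, y, f):
    """Symplectic form [x, y] = x^T J y, assuming J = ((0, 1), (-1, 0)) in f x f blocks."""
    return sum(x[i] * y[i + f] - x[i + f] * y[i] for i in range(f))


def volume(vectors):
    """Left-hand side of (2.117): the determinant (2.116) via the Leibniz formula."""
    n = len(vectors)
    return sum(
        sign(p) * math.prod(vectors[k][p[k]] for k in range(n))
        for p in itertools.permutations(range(n))
    )


def lemma_rhs(vectors, f):
    """Right-hand side of (2.117): signed sum over products of f symplectic brackets."""
    total = 0.0
    for p in itertools.permutations(range(2 * f)):
        term = sign(p)
        for k in range(f):
            term *= bracket(vectors[p[2 * k]], vectors[p[2 * k + 1]], f)
        total += term
    return (-1) ** (f // 2) * total / (math.factorial(f) * 2 ** f)


def max_mismatch(f, trials=3, seed=0):
    """Largest |lhs - rhs| of (2.117) over a few random sets of 2f vectors."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        vectors = [[rng.uniform(-1, 1) for _ in range(2 * f)] for _ in range(2 * f)]
        worst = max(worst, abs(volume(vectors) - lemma_rhs(vectors, f)))
    return worst


if __name__ == "__main__":
    for f in (1, 2, 3):
        print("f =", f, "largest mismatch:", max_mismatch(f))
```

The mismatch should be at the level of floating-point rounding; the cost grows like $(2f)!$, so this is a consistency check for small $f$ rather than a practical way to compute determinants.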

The lemma serves to prove the following proposition. 

Proposition. Every $M$ pertaining to $Sp_{2f}$ has determinant +1:

if $M \in Sp_{2f}$ then $\det M = +1$ .

Proof. By the product formula for determinants we have

$$ [M x^{(1)}, \ldots, M x^{(2f)}] = (\det M)\,[x^{(1)}, \ldots, x^{(2f)}] . $$

As the vectors $x^{(1)}, \ldots, x^{(2f)}$ are linearly independent, their determinant does not
vanish. Now, from the lemma (2.117), and because every bracket $[Mx, My] = [x, y]$
is invariant under $M$,

$$ [M x^{(1)}, \ldots, M x^{(2f)}] = [x^{(1)}, \ldots, x^{(2f)}] . $$

We conclude that $\det M = +1$. □
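
The proposition can be illustrated with exact integer arithmetic. This is a sketch under stated assumptions: $J$ is taken in the block form $((0, 1), (-1, 0))$ with $f = 2$, and the particular matrices `shear`, `scale` and `N` are hypothetical examples chosen here, not taken from the text. The product of elementary symplectic matrices satisfies $M^T J M = J$ and has determinant +1, while `N` shows that $\det N = 1$ alone does not make a matrix symplectic when $f > 1$.

```python
import itertools
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def det(A):
    """Determinant via the Leibniz formula (fine for small matrices)."""
    n = len(A)
    total = 0
    for p in itertools.permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inversions * math.prod(A[i][p[i]] for i in range(n))
    return total


# J for f = 2 in the assumed block form ((0, 1), (-1, 0)).
J = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]

# Elementary symplectic matrices (illustrative choices):
# shear = ((1, B), (0, 1)) with symmetric B = ((2, 3), (3, -1)),
# scale = diag(A, (A^T)^-1) with A = ((2, 1), (1, 1)).
shear = [[1, 0, 2, 3], [0, 1, 3, -1], [0, 0, 1, 0], [0, 0, 0, 1]]
scale = [[2, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, -1], [0, 0, -1, 2]]
M = matmul(matmul(shear, scale), J)

# A matrix with determinant 1 that is NOT symplectic.
N = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def is_symplectic(X):
    """True if X^T J X equals J exactly (integer entries assumed)."""
    return matmul(matmul(transpose(X), J), X) == J


if __name__ == "__main__":
    print("M symplectic:", is_symplectic(M), " det M =", det(M))
    print("N symplectic:", is_symplectic(N), " det N =", det(N))
```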

Remarks: In this section we have been talking about vectors on phase space $P$
while until now $x$ etc. were points of $P$. This was justified because we assumed the
phase space to be $\mathbb{R}^{2f}$, for which every tangent space can be identified with its base
space. If $P$ is not flat any more, but is a differentiable manifold, our description
holds in local coordinate systems (also called charts). This is worked out in more
detail in Chap. 5.



2.29 Liouville’s Theorem 

As in Sect. 1.19 we denote the solutions of Hamilton’s equations by

$$ \Phi_{t,s}(x) = (\varphi^1_{t,s}(x), \ldots, \varphi^{2f}_{t,s}(x)) . \tag{2.118} $$

Also as before we call $\Phi_{t,s}(x)$ the flow in phase space. Indeed, if $x$ denotes the
initial configuration the system assumes at the initial time $s$, (2.118) describes how
the system flows across phase space and goes over to the configuration $y$ assumed
at time $t$. The temporal evolution of a canonical system can be visualized as the
flow of an incompressible fluid: the flow conserves volume and orientation. Given
a set of initial configurations which, at time $s$, fill a certain oriented domain $U_s$ of
phase space, this same ensemble will be found to lie in an oriented phase-space
domain $U_t$ at time $t$ (later or earlier than $s$), in such a way that $U_s$ and $U_t$ have
the same volume $V_s = V_t$ and their orientation is the same. This is the content of
Liouville’s theorem.

In order to work out its significance we formulate and prove this theorem in
two equivalent ways. The first formulation consists in showing that the matrix
(2.119) of partial derivatives is symplectic. This matrix is precisely the Jacobian
of the transformation $dx \mapsto dy = (D\Phi)\,dx$. As it is symplectic, it has determinant
+1. In the second formulation (which is equivalent to the first) we show that the
flow has divergence zero, which means that there is no net flow out of $U_s$ nor into
$U_s$.



2.29.1 The Local Form 



The matrix of partial derivatives of $\Phi$ being abbreviated by

$$ D\Phi_{t,s}(x) \overset{\text{def}}{=} \left( \frac{\partial \Phi^i_{t,s}(x)}{\partial x_k} \right) , \tag{2.119} $$

the theorem reads as follows.

Liouville’s theorem. Let $\Phi_{t,s}(x)$ be the flow of the differential equation
$-J\dot{x} = H_{,x}$. For all $x$, $t$ and $s$ for which the flow is defined,

$$ D\Phi_{t,s}(x) \in Sp_{2f} . \tag{2.120} $$

The matrix of partial derivatives is symplectic and therefore has determinant
+1.



Before we proceed to prove this theorem we wish to interpret the consequences
of (2.120). The flow $\Phi_{t,s}(x)$ is a mapping that maps the point $x$ (assumed by
the system at time $s$) onto the point $x_t = \Phi_{t,s}(x)$ (assumed at time $t$). Suppose
we consider neighboring initial conditions filling the volume element $dx_1 \ldots dx_{2f}$.
The statement (2.120) then tells us that this volume as well as its orientation is
conserved under the flow. Indeed, the matrix (2.119) is nothing but the Jacobian
of this mapping.

Proof. We have $-J[\partial \Phi_{t,s}(x)/\partial t] = H_{,x}(t) \circ \Phi_{t,s}$. Taking the differential of the
equation $-J\dot{x} = H_{,x}$ by $x$, and using the chain rule, we obtain
$-J[\partial D\Phi_{t,s}(x)/\partial t] = (DH_{,x})(\Phi, t)\, D\Phi_{t,s}(x)$ and finally

$$ \frac{\partial}{\partial t}\left[ (D\Phi_{t,s}(x))^T J\, (D\Phi_{t,s}(x)) \right] = -(D\Phi_{t,s})^T \left[ DH_{,x} - (DH_{,x})^T \right] (D\Phi_{t,s}) = 0 . \tag{2.121} $$

This expression is zero because $DH_{,x} = (\partial^2 H/\partial x_i \partial x_k)$ is symmetric. Equation
(2.120) is obvious for $t = s$. It then follows from (2.121) that it holds for all $t$.
Thus, the theorem is proved. □
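
The local form can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the method of the text: it takes the pendulum Hamiltonian $H = p^2/2 + (1 - \cos q)$ in dimensionless units with $f = 1$, integrates the flow with a classical Runge-Kutta scheme, and estimates the Jacobian (2.119) by central differences. For $f = 1$ the symplectic condition $M^T J M = J$ is equivalent to $\det M = 1$, so only the determinant is checked. The function names and step sizes are arbitrary choices.

```python
import math


def rhs(q, p):
    """Hamilton's equations for H = p^2/2 + (1 - cos q): dq/dt = p, dp/dt = -sin q."""
    return p, -math.sin(q)


def flow(q, p, t, steps=2000):
    """Approximate flow map Phi_{t,0}(q, p) with the classical RK4 scheme."""
    h = t / steps
    for _ in range(steps):
        k1q, k1p = rhs(q, p)
        k2q, k2p = rhs(q + 0.5 * h * k1q, p + 0.5 * h * k1p)
        k3q, k3p = rhs(q + 0.5 * h * k2q, p + 0.5 * h * k2p)
        k4q, k4p = rhs(q + h * k3q, p + h * k3p)
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q, p


def jacobian(q, p, t, eps=1e-5):
    """Central-difference estimate of D Phi_{t,0} at (q, p); rows = outputs, columns = inputs."""
    fq_plus, fq_minus = flow(q + eps, p, t), flow(q - eps, p, t)
    fp_plus, fp_minus = flow(q, p + eps, t), flow(q, p - eps, t)
    return [
        [(fq_plus[0] - fq_minus[0]) / (2 * eps), (fp_plus[0] - fp_minus[0]) / (2 * eps)],
        [(fq_plus[1] - fq_minus[1]) / (2 * eps), (fp_plus[1] - fp_minus[1]) / (2 * eps)],
    ]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


if __name__ == "__main__":
    m = jacobian(0.5, 1.2, 3.0)
    print("D Phi =", m)
    print("det D Phi =", det2(m))
```

The individual entries of the Jacobian change with the running time, but its determinant should stay equal to 1 up to the accuracy of the integrator and of the finite differences.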

The following converse of Liouville’s theorem also holds. Let $\Phi_{t,s}$ be the flow
of the differential equation $-J\dot{x} = F(x, t)$ and assume that it fulfills (2.120). Then
there exists locally a Hamiltonian function $H(x, t)$ such that $H_{,x} = F(x, t)$. This is
seen as follows. The equation analogous to (2.121) now says that $DF - (DF)^T = 0$,
or that the curl of $F$ vanishes: $\operatorname{curl} F = 0$. If this is so, $F$ can be written locally
as a gradient: $F = H_{,x}$.

2.29.2 The Global Form 

The statement of Liouville’s theorem can be made more transparent by the example
of a set of initial conditions that fill a finite oriented domain $U_s$ whose volume is
$V_s$. At time $s$ we have

$$ V_s = \int_{U_s} dx , $$

the integral being taken over the domain $U_s$ of phase space. At another time $t$ we
have

$$ V_t = \int_{U_t} dy = \int_{U_s} dx\, \det\left( \frac{\partial y}{\partial x} \right) = \int_{U_s} dx\, \det(D\Phi_{t,s}) , $$

because in transforming an oriented multiple integral to new variables, the volume
element is multiplied by the determinant of the corresponding Jacobi matrix. If
we take $t$ in the neighborhood of $s$ we can expand in terms of $(t - s)$:

$$ \Phi_{t,s}(x) = x + F(x, t) \cdot (t - s) + O((t - s)^2) , \quad \text{where} \quad F(x, t) = J H_{,x} = \left( \frac{\partial H}{\partial p}, -\frac{\partial H}{\partial q} \right) . $$

From the definition (2.119) the derivative by $x$ is

$$ D\Phi_{t,s}(x) = \mathbb{1} + DF(x, t) \cdot (t - s) + O((t - s)^2) , $$

or, when written out explicitly,

$$ \frac{\partial \Phi^i_{t,s}(x)}{\partial x^k} = \delta_{ik} + \frac{\partial F^i}{\partial x^k}\,(t - s) + O((t - s)^2) . $$

In taking the determinant, one makes use of the following formula:

$$ \det(\mathbb{1} + A\varepsilon) \equiv \det(\delta_{ik} + A_{ik}\varepsilon) = 1 + \varepsilon \operatorname{Tr} A + O(\varepsilon^2) , $$

where $\operatorname{Tr} A = \sum_i A_{ii}$ denotes the trace of $A$ and where $\varepsilon$ is to be identified with
$(t - s)$. We obtain

$$ \det(D\Phi_{t,s}(x)) = 1 + (t - s) \sum_{i=1}^{2f} \frac{\partial F^i}{\partial x^i} + O((t - s)^2) . $$
The trace $\sum_{i=1}^{2f} \partial F^i/\partial x^i$ is a divergence in the $2f$-dimensional phase space. It is
easy to see that it vanishes if $F = J H_{,x}$, viz.

$$ \operatorname{div} F \overset{\text{def}}{=} \sum_{i=1}^{2f} \frac{\partial F^i}{\partial x^i} = \frac{\partial}{\partial q}\left( \frac{\partial H}{\partial p} \right) + \frac{\partial}{\partial p}\left( -\frac{\partial H}{\partial q} \right) = 0 . $$

This shows that $V_s = V_t$. As long as the flow is defined, the domain $U_s$ of
initial conditions can change its position and its shape but not its volume or its
orientation.
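
As a small worked illustration of the determinant formula used in this section (the matrix $A$ below is an arbitrary example chosen here, not one from the text), the expansion can be checked by hand in two dimensions:

```latex
% Illustrative example: A is an arbitrary 2x2 matrix with Tr A = 5.
A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} , \qquad
\det(\mathbb{1} + \varepsilon A)
  = (1 + \varepsilon)(1 + 4\varepsilon) - (2\varepsilon)(3\varepsilon)
  = 1 + 5\varepsilon - 2\varepsilon^2
  = 1 + \varepsilon \operatorname{Tr} A + O(\varepsilon^2) .
```

The term linear in $\varepsilon$ is the trace, and the quadratic remainder is exactly the kind of $O((t - s)^2)$ correction that is dropped in the expansion of $\det(D\Phi_{t,s})$.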



2.30 Examples for the Use of Liouville’s Theorem 

Example (i) A particularly simple example is provided by the linear, autonomous
system with $f = 1$ that we studied in Sect. 2.26. Here the action of the flow is
simply multiplication of the initial configuration $x$ by the matrix $P(t - s)$ whose
determinant is +1. In the special case of the harmonic oscillator, for instance, all
phase points move on circles around the origin, with constant and universal angular
velocity. A given domain $U_s$ moves around the origin unchanged, like the hand of
a clock. This is sketched in Figs. 2.11 and 2.12.




Fig. 2.12. The harmonic oscillator. A circular domain
of initial configurations wanders uniformly about the
origin. In the units introduced in Sect. 1.17.1 the period
is $\tau^{(0)} = 2\pi$. The four positions shown here correspond
to the times $\tau = 0$, $0.2\tau^{(0)}$, $0.4\tau^{(0)}$, and $0.75\tau^{(0)}$



Example (ii) The example of the mathematical pendulum is somewhat less trivial.
We note its equations of motion in the dimensionless form of Sect. 1.17.2,

$$ \frac{dz_1}{d\tau} = z_2(\tau) , \qquad \frac{dz_2}{d\tau} = -\sin z_1(\tau) , $$

where $\tau$ is the dimensionless time variable $\tau = \omega t$, while the reduced energy was
defined to be

$$ \varepsilon = E/mgl = \tfrac{1}{2} z_2^2 + (1 - \cos z_1) . $$

The quantity $\varepsilon$ is constant along every phase portrait (the solution of the canonical
equations).

Figure 1.10 in Sect. 1.17.2 shows the phase portraits in phase space $(z_1, z_2)$
($z_1$ is the same as $q$, $z_2$ is the same as $p$). For example, a disklike domain of
initial configurations $U_s$ behaves under the flow as indicated in Figs. 2.13-2.15. In
dimensionless units, the period of the harmonic oscillator is $\tau^{(0)} = 2\pi$.

The figures show three positions of the domain $U_t$ into which $U_s$ has moved at
the times $k \cdot \tau^{(0)}$ indicated in the captions. As the motion is periodic, one should
think of these figures being glued on a cylinder of circumference $2\pi$, such that the
lines $(\pi, p)$ and $(-\pi, p)$ coincide. The deformation of the initial domain is clearly
visible. It is particularly noticeable whenever one of the phase points moves along
the separatrix (cf. Sect. 1.23 (1.59)). This happens for $\varepsilon = 2$, i.e. for the initial
condition $(q = 0, p = 2)$, for example. For large, positive time such a point
wanders slowly towards the point $(q = \pi, p = 0)$. Neighboring points with $\varepsilon > 2$
“swing through”, while those with $\varepsilon < 2$ “oscillate”, i.e. turn around the origin
several times. The figures show very clearly that despite these deformations the
original volume and orientation are preserved.
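
The behavior described for the pendulum can be reproduced numerically. The following is a rough sketch under assumptions of our own: the disk center $(q, p) = (0, 1)$, its radius 0.3, the running time $\tau = \pi$, the Runge-Kutta integrator and the polygon approximation of the boundary are all illustrative choices and are not taken from the figures. The boundary of the disk is transported with the flow and its enclosed area is measured with the shoelace formula; the signed area also records the orientation.

```python
import math


def rk4_step(q, p, h):
    """One classical Runge-Kutta step for dq/dtau = p, dp/dtau = -sin q."""
    k1q, k1p = p, -math.sin(q)
    k2q, k2p = p + 0.5 * h * k1p, -math.sin(q + 0.5 * h * k1q)
    k3q, k3p = p + 0.5 * h * k2p, -math.sin(q + 0.5 * h * k2q)
    k4q, k4p = p + h * k3p, -math.sin(q + h * k3q)
    return (q + h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6,
            p + h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6)


def evolve(points, tau, steps=300):
    """Transport every phase point over the dimensionless time tau."""
    h = tau / steps
    result = []
    for q, p in points:
        for _ in range(steps):
            q, p = rk4_step(q, p, h)
        result.append((q, p))
    return result


def polygon_area(points):
    """Signed area (shoelace formula); positive for counterclockwise ordering."""
    area = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        area += q1 * p2 - q2 * p1
    return 0.5 * area


def disk_boundary(center, radius, n=360):
    """Counterclockwise polygon approximating the boundary of a disk."""
    cq, cp = center
    return [(cq + radius * math.cos(2 * math.pi * k / n),
             cp + radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


if __name__ == "__main__":
    boundary = disk_boundary((0.0, 1.0), 0.3)      # domain below the separatrix
    moved = evolve(boundary, math.pi)              # half an oscillator period
    print("area at tau = 0 :", polygon_area(boundary))
    print("area at tau = pi:", polygon_area(moved))
```

The two areas should agree up to the discretization error of the polygon, and both keep the same sign, in line with the statement that volume and orientation are preserved while the shape of the domain is deformed.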



Fig. 2.13. The mathematical pendulum. A circular domain
of initial configurations below the separatrix (assumed
at time $\tau = 0$) moves about the origin somewhat
more slowly than for the case of the oscillator, Fig. 2.12.
As the domain proceeds, it is more and more deformed.
The positions shown here correspond to the times (in a
clockwise direction): $\tau = 0$, $0.25\tau^{(0)}$, $0.5\tau^{(0)}$, and $\tau^{(0)}$. ($\tau^{(0)}$
is the period of the harmonic oscillator that is obtained
approximately for small amplitudes of the pendulum)

Fig. 2.14. Same as Fig. 2.13 but with the uppermost
point of the circular domain now moving along the
separatrix. The successive positions in a clockwise direction
shown are reached at $\tau = 0$, $0.2\tau^{(0)}$, $0.4\tau^{(0)}$, and $0.75\tau^{(0)}$.
The arrows indicate the motion of the point with initial
configuration $(q = 0, p = 1)$. The open points show the
motion of the center of the initial circle

Fig. 2.15. Same system as in Figs. 2.13, 2.14. The center
of the initial circular domain is now on the separatrix.
Points on the separatrix approach the point $(q = \pi, p = 0)$
asymptotically, points below it move around the origin,
and points above “swing through”. The successive
positions correspond in a clockwise direction to $\tau = 0$,
$0.1\tau^{(0)}$, $0.25\tau^{(0)}$, and $0.5\tau^{(0)}$


Example (iii) Charged particles in external electromagnetic fields obey the
equation of motion (2.29):

$$ m\ddot{r} = \frac{e}{c}\, \dot{r} \times B + eE . $$

As we saw in Example (ii) of Sect. 2.8, this equation follows from a Lagrangian
function such as the one given in (2.31). We showed in Example (ii) of Sect. 2.16 that
the condition (2.44) for the existence of the Legendre transform of $L$ is fulfilled. A
Hamiltonian function describing this system is given by (2.51). For an ensemble
of charged particles in external electric and magnetic fields we must also take
into account the mutual Coulomb interaction between them. This, however, can
be included in the Hamiltonian function. Therefore, a system of charged particles
in external fields is canonical and obeys Liouville’s theorem. Because this theorem
guarantees the conservation of phase-space volume, it plays a central role
in the construction of accelerators and of beam lines for elementary particles.



2.31 Poisson Brackets 



The Poisson bracket is a skew-symmetric bilinear form of derivatives of two
dynamical quantities with respect to coordinates and momenta. A dynamical quantity is
any physically relevant function of generalized coordinates and momenta such as
the kinetic energy, the Hamiltonian function, or the total angular momentum. Let
$g(q, p, t)$ be such a dynamical quantity. The Poisson bracket of $g$ and the
Hamiltonian function appears in quite a natural way if we calculate the total time change
of $g$ along a physical orbit in phase space, as follows:

$$ \frac{dg}{dt} = \frac{\partial g}{\partial t} + \sum_{i=1}^{f} \left( \frac{\partial g}{\partial q_i}\,\dot{q}_i + \frac{\partial g}{\partial p_i}\,\dot{p}_i \right)
 = \frac{\partial g}{\partial t} + \sum_{i=1}^{f} \left( \frac{\partial g}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial g}{\partial p_i}\frac{\partial H}{\partial q_i} \right)
 = \frac{\partial g}{\partial t} + \{H, g\} . $$


In the second step we have made use of the canonical equations (2.45). In the third
step we introduced the bracket symbol $\{\ ,\ \}$ as shorthand for the sum in the second
expression. The Poisson bracket of $H$ and $g$ describes the temporal evolution of the
quantity $g$. Furthermore, as we shall discover below, the bracket $\{f, g\}$ of any two
quantities is preserved under canonical transformations. We also wish to mention
that, both in form and content, the Poisson bracket finds its analog in quantum
mechanics: the commutator. In quantum mechanics, dynamical quantities (which
are also called observables) are represented by operators (more precisely by
self-adjoint operators over Hilbert space). The commutator of two operators contains
the information whether or not the corresponding observables can be measured
simultaneously. Therefore, the Poisson bracket is not only an important notion of
canonical mechanics but also reveals some of its underlying structure and hints at
the relationship between classical mechanics and quantum mechanics.

Let $f(x)$ and $g(x)$ be two dynamical quantities, i.e. functions of coordinates
and momenta, which are at least $C^1$. (They may also depend explicitly on time.
As this is of no importance for what follows here, we suppress this possible
dependence.) Their Poisson bracket $\{f, g\}$ is a scalar product of the type (2.114) and
is defined as follows$^7$:

$$ \{f, g\}(x) \overset{\text{def}}{=} \sum_{i=1}^{f} \left( \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i} - \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} \right) . \tag{2.122} $$




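We also have

$$ \{f, g\}(x) = -[f_{,x}, g_{,x}](x) = -\left( \frac{\partial f}{\partial q_1}, \ldots, \frac{\partial f}{\partial q_f}, \frac{\partial f}{\partial p_1}, \ldots, \frac{\partial f}{\partial p_f} \right) \begin{pmatrix} 0 & \mathbb{1} \\ -\mathbb{1} & 0 \end{pmatrix} \begin{pmatrix} \partial g/\partial q_1 \\ \vdots \\ \partial g/\partial q_f \\ \partial g/\partial p_1 \\ \vdots \\ \partial g/\partial p_f \end{pmatrix} . \tag{2.123} $$
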



This latter form reveals an important property of the Poisson bracket: it is invariant
under canonical transformations. Let $\Psi$ be such a transformation:

$$ x = (q_1, \ldots, q_f, p_1, \ldots, p_f) \;\longmapsto\; \Psi(x) = (Q_1(x), \ldots, Q_f(x), P_1(x), \ldots, P_f(x)) . $$

From Sects. 2.27 and 2.28 we know that $D\Psi(x) \in Sp_{2f}$. It is important to realize
that $\Psi$ maps the phase space onto itself, $\Psi : \mathbb{R}^{2f} \to \mathbb{R}^{2f}$, while $f$ and $g$ map the
phase space $\mathbb{R}^{2f}$ onto the real numbers $\mathbb{R}$. In other words, $f$ and $g$ are prescriptions
for how to form real functions of their arguments, taken from $\mathbb{R}^{2f}$, such as $f = q^2$,
$g = (q^2 + p^2)/2$, etc. These prescriptions can be applied to the old variables $(q, p)$
or, alternatively, to the new ones $(Q, P)$. We then have the following



Proposition. For all $f$, $g$, and $x$

$$ \{f \circ \Psi, g \circ \Psi\}(x) = \{f, g\} \circ \Psi(x) , \tag{2.124} $$

provided $\Psi(x)$ is a canonical transformation. In words: if one transforms
the quantities $f$ and $g$ to the new variables and then takes their Poisson
bracket, the result is the same as that obtained in transforming their original
Poisson bracket to the new variables.



7 We define the bracket such that it corresponds to the commutator $[f, g]$ of quantum mechanics,
without change of sign.
Proof. Take the derivatives

$$ \frac{\partial}{\partial x_i}(f \circ \Psi)(x) = \sum_{k=1}^{2f} \left. \frac{\partial f}{\partial y_k} \right|_{y = \Psi(x)} \frac{\partial \Psi_k}{\partial x_i} , $$

or, in compact notation,

$$ (f \circ \Psi)_{,x} = (D\Psi)^T(x) \cdot f_{,y}(\Psi(x)) . $$

By assumption, $\Psi$ is canonical, i.e. $D\Psi$ and $(D\Psi)^T$ are symplectic. Therefore

$$ [(f \circ \Psi)_{,x}, (g \circ \Psi)_{,x}] = [(D\Psi)^T(x) f_{,y}(\Psi(x)), (D\Psi)^T(x) g_{,y}(\Psi(x))]
 = [f_{,y}(\Psi(x)), g_{,y}(\Psi(x))] = -\{f, g\} \circ \Psi(x) . \;\; □ $$

The proposition has the following corollary. 

Corollary. If (2.124) holds identically, or if the weaker condition

$$ \{x_i \circ \Psi, x_k \circ \Psi\}(x) = \{x_i, x_k\} \circ \Psi(x) \tag{2.125} $$

holds true for all $x$ and $i, k$, then $\Psi(x)$ is canonical.



Proof. From the definition (2.122) we have $\{x_i, x_k\} = -J_{ik}$. By assumption, this
is invariant under the transformation $\Psi$, i.e.

$$ \{y_m, y_n\}(x) = -[(D\Psi)^T(x) e_m, (D\Psi)^T(x) e_n] = -[e_m, e_n] = -J_{mn} , $$

where $e_m$ and $e_n$ are unit vectors in $\mathbb{R}^{2f}$. Thus,

$$ (D\Psi) J (D\Psi)^T = J . \;\; □ $$

Note that (2.125), when written in terms of $q$, $p$, $Q$, and $P$, reads

$$ \{Q_i, Q_j\}(x) = \{q_i, q_j\}(x) = 0 , \qquad
   \{P_i, P_j\}(x) = \{p_i, p_j\}(x) = 0 , \qquad
   \{P_i, Q_j\}(x) = \{p_i, q_j\}(x) = \delta_{ij} . \tag{2.126} $$

We close this section with the remark that canonical transformations can be
characterized in four equivalent ways. The transformation $\Psi : (q, p) \mapsto (Q, P)$ is
canonical if

(a) it leaves unchanged the canonical equations (2.45), or

(b) it leaves invariant all Poisson brackets between dynamical quantities $f$ and $g$,
or

(c) it just leaves invariant the set of Poisson brackets (2.126), or

(d) the matrix of its derivatives is symplectic, $D\Psi \in Sp_{2f}$.
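
Criterion (c) lends itself to a quick numerical test. The sketch below makes several assumptions of its own: one degree of freedom, central finite differences for the derivatives, and two example maps that are chosen here for illustration, namely $Q = \arctan(q/p)$, $P = (q^2 + p^2)/2$ and the rescaling $Q = 2q$, $P = p$. The bracket is coded in the sign convention of (2.122), so a canonical map must give $\{P, Q\} = 1$.

```python
import math


def partial(func, q, p, var, h=1e-6):
    """Central-difference derivative of func(q, p) with respect to 'q' or 'p'."""
    if var == "q":
        return (func(q + h, p) - func(q - h, p)) / (2 * h)
    return (func(q, p + h) - func(q, p - h)) / (2 * h)


def poisson(f, g, q, p):
    """{f, g} = df/dp * dg/dq - df/dq * dg/dp, the sign convention of (2.122), f = 1."""
    return (partial(f, q, p, "p") * partial(g, q, p, "q")
            - partial(f, q, p, "q") * partial(g, q, p, "p"))


# Illustrative map 1: Q = arctan(q/p), P = (q^2 + p^2)/2.
def Q_new(q, p):
    return math.atan2(q, p)


def P_new(q, p):
    return 0.5 * (q * q + p * p)


# Illustrative map 2: a plain rescaling of q, which is not canonical.
def Q_bad(q, p):
    return 2.0 * q


def P_bad(q, p):
    return p


if __name__ == "__main__":
    q0, p0 = 0.7, 1.3
    print("{P, Q} for map 1:", poisson(P_new, Q_new, q0, p0))
    print("{P, Q} for map 2:", poisson(P_bad, Q_bad, q0, p0))
```

For the first map the bracket should come out as 1 at any point away from the origin, while the rescaling gives 2 and therefore fails criterion (c).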
2.32 Properties of Poisson Brackets



It is possible to write the canonical equations (2.45) by means of Poisson brackets,
in a more symmetric form. Indeed, as one may easily verify, they read

$$ \dot{q}_k = \{H, q_k\} , \qquad \dot{p}_k = \{H, p_k\} , \tag{2.127} $$

or in the compact notation of Sect. 2.27,

$$ \dot{x}_k = -[H_{,x}, x_{k,x}] = \sum_{i=1}^{2f} J_{ki} \frac{\partial H}{\partial x_i} . $$



Let $g(q, p, t)$ be a dynamical quantity, assumed to be at least $C^1$ in all its variables.
As above, we calculate the total derivative of $g$ with respect to time and make use
of the canonical equations:

$$ \frac{d}{dt}\, g(q, p, t) = \frac{\partial g}{\partial t} + \sum_{k=1}^{f} \left( \frac{\partial g}{\partial q_k}\,\dot{q}_k + \frac{\partial g}{\partial p_k}\,\dot{p}_k \right) = \frac{\partial g}{\partial t} + \{H, g\} . \tag{2.128} $$

This generalizes (2.127) to arbitrary dynamical quantities. If $g$ is an integral of the
motion, then

$$ \frac{\partial g}{\partial t} + \{H, g\} = 0 , \tag{2.129a} $$

or, if $g$ has no explicit time dependence,

$$ \{H, g\} = 0 . \tag{2.129b} $$



Obviously, the Poisson bracket (2.122) has all the properties of the symplectic
scalar product (2.115a, c). Besides these, it has the following properties. The
bracket of $g(q, p, t)$ with $q_k$ is equal to the derivative of $g$ by $p_k$, while its bracket
with $p_k$ is minus its derivative by $q_k$:

$$ \{g, q_k\} = \frac{\partial g}{\partial p_k} , \qquad \{g, p_k\} = -\frac{\partial g}{\partial q_k} . \tag{2.130} $$



Furthermore, for any three quantities $u(q, p, t)$, $v(q, p, t)$, and $w(q, p, t)$ that are
at least $C^2$ we can derive the following identity:

the Jacobi identity $\{u, \{v, w\}\} + \{v, \{w, u\}\} + \{w, \{u, v\}\} = 0$ . (2.131)

This important identity can be verified either by direct calculation, using the
definition (2.122), or by expressing the brackets via (2.123) in terms of the scalar product
(2.114), as follows. For the sake of clarity we use the abbreviation $\partial u/\partial x^i \equiv u_i$ for
partial derivatives and, correspondingly, $u_{ik}$ for the second derivatives $\partial^2 u/\partial x^i \partial x^k$.
We then have

$$ \{u, \{v, w\}\} = -[u_{,x}, \{v, w\}_{,x}] = +[u_{,x}, [v_{,x}, w_{,x}]_{,x}]
 = \sum_{i=1}^{2f} \sum_{k=1}^{2f} \sum_{m=1}^{2f} \sum_{n=1}^{2f} u_i J_{ik} \frac{\partial}{\partial x^k} (v_m J_{mn} w_n) . $$

Thus, the left-hand side of (2.131) is given by

$$ \{u, \{v, w\}\} + \{v, \{w, u\}\} + \{w, \{u, v\}\}
 = \sum_{ikmn} u_i J_{ik} J_{mn} (v_{mk} w_n + v_m w_{nk})
 + \sum_{ikmn} v_i J_{ik} J_{mn} (w_{mk} u_n + w_m u_{nk})
 + \sum_{ikmn} w_i J_{ik} J_{mn} (u_{mk} v_n + u_m v_{nk}) . $$

The six terms of this sum are pairwise equal and opposite. For example, take the
last term on the right-hand side and make the following replacement in the indices:
$m \to i$, $i \to n$, $k \to m$, and $n \to k$. As these are summation indices, the value
of the term is unchanged. Thus, it becomes $w_n J_{nm} J_{ik} u_i v_{km}$. As $v_{km} = v_{mk}$,
but $J_{nm} = -J_{mn}$, it cancels the first term. In a similar fashion one sees that the
second and third terms cancel, and similarly the fourth and the fifth.

The Jacobi identity (2.131) is used to demonstrate the following assertion. 



Poisson’s theorem. The Poisson bracket of two integrals of the motion is 
again an integral of the motion. 



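Proof. Let $\{u, v\} = w$. Then, from (2.128),

$$ \frac{d}{dt}\, w = \frac{\partial}{\partial t}\, w + \{H, w\} $$

and by (2.131)

$$ \begin{aligned}
\frac{d}{dt}\, w = \frac{d}{dt} \{u, v\}
 &= \left\{ \frac{\partial u}{\partial t}, v \right\} + \left\{ u, \frac{\partial v}{\partial t} \right\} - \{u, \{v, H\}\} - \{v, \{H, u\}\} \\
 &= \left\{ \frac{\partial u}{\partial t} + \{H, u\}, v \right\} + \left\{ u, \frac{\partial v}{\partial t} + \{H, v\} \right\} \\
 &= \left\{ \frac{du}{dt}, v \right\} + \left\{ u, \frac{dv}{dt} \right\} .
\end{aligned} \tag{2.132} $$

Therefore, if $(du/dt) = 0$ and $(dv/dt) = 0$, then also $(d/dt)\{u, v\} = 0$. □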

Even if u, v are not conserved, (2.132) is an interesting result: the time derivative 
of the Poisson bracket obeys the product rule. 
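
Poisson’s theorem can be made concrete with the angular momentum of a single particle. This is a sketch under assumptions chosen here for illustration: a central potential $U(r) = -1/r$ with unit mass, central finite differences for all derivatives, and the bracket coded in the sign convention of (2.122). In that convention one expects $\{l_1, l_2\} = -l_3$, so the bracket of the two conserved quantities $l_1$ and $l_2$ is, up to sign, the conserved quantity $l_3$.

```python
import math


def partial(func, q, p, var, i, h=1e-5):
    """Central-difference derivative of func(q, p) with respect to q[i] or p[i]."""
    q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
    if var == "q":
        q_plus[i] += h
        q_minus[i] -= h
    else:
        p_plus[i] += h
        p_minus[i] -= h
    return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2 * h)


def poisson(f, g, q, p):
    """{f, g} = sum_i (df/dp_i dg/dq_i - df/dq_i dg/dp_i), sign convention of (2.122)."""
    return sum(
        partial(f, q, p, "p", i) * partial(g, q, p, "q", i)
        - partial(f, q, p, "q", i) * partial(g, q, p, "p", i)
        for i in range(len(q))
    )


# Components of l = q x p for one particle (0-based indices).
def l1(q, p):
    return q[1] * p[2] - q[2] * p[1]


def l2(q, p):
    return q[2] * p[0] - q[0] * p[2]


def l3(q, p):
    return q[0] * p[1] - q[1] * p[0]


def hamiltonian(q, p):
    """Illustrative central-force Hamiltonian: H = p^2/2 - 1/r (unit mass assumed)."""
    r = math.sqrt(sum(x * x for x in q))
    return 0.5 * sum(x * x for x in p) - 1.0 / r


if __name__ == "__main__":
    q0, p0 = [0.3, -1.1, 0.8], [0.5, 0.2, -0.7]
    for name, comp in (("l1", l1), ("l2", l2), ("l3", l3)):
        print("{H, %s} =" % name, poisson(hamiltonian, comp, q0, p0))
    print("{l1, l2} =", poisson(l1, l2, q0, p0), "  -l3 =", -l3(q0, p0))
```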
2.33 Infinitesimal Canonical Transformations

Those canonical transformations which can be deformed continuously into the 
identity form a particularly important class. In this case one can construct canon- 
ical transformations that differ from the identity only infinitesimally. This means 
that one can study the local action of a canonical transformation - in close analogy 
to the case of the rotation group we studied in Sect. 2.22. 

We start from class B canonical transformations (2.85) (see Sect. 2.23) and
from the identical mapping

$$ S_E = \sum_{k=1}^{f} q_k P_k : \quad (q, p, H) \mapsto (Q = q, P = p, \tilde{H} = H) \tag{2.133} $$

of Example (i) in Sect. 2.24. Let $\varepsilon$ be a parameter, taken to be infinitesimally small,
and $\sigma(q, P)$ a differentiable function of old coordinates and new momenta. (For
the moment we only consider transformations without explicit time dependence.)
We set

$$ S(q, P, \varepsilon) = S_E + \varepsilon\,\sigma(q, P) + O(\varepsilon^2) . \tag{2.134} $$



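The function

$$ \sigma(q, P) = \left. \frac{\partial S}{\partial \varepsilon} \right|_{\varepsilon = 0} \tag{2.135} $$

is said to be the generating function of the infinitesimal transformation (2.134).
From (2.87) we obtain

$$ Q_i = \frac{\partial S}{\partial P_i} = q_i + \varepsilon \frac{\partial \sigma}{\partial P_i} + O(\varepsilon^2) , \tag{2.136a} $$

$$ p_j = \frac{\partial S}{\partial q_j} = P_j + \varepsilon \frac{\partial \sigma}{\partial q_j} + O(\varepsilon^2) . \tag{2.136b} $$

Here the derivatives $\partial \sigma/\partial P_i$, $\partial \sigma/\partial q_j$ depend on the old coordinates and on the
new momenta. However, if we remain within the first order in the parameter $\varepsilon$,
then, for consistency, all $P_j$ must be replaced by $p_j$. ($P_j$ differs from $p_j$ by terms
of order $\varepsilon$. If we kept it, we would in fact include some, but not all, terms of
second order $\varepsilon^2$ in (2.136).) From (2.136) we then have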

$$ \delta q_i = Q_i - q_i = \frac{\partial \sigma(q, p)}{\partial p_i}\, \varepsilon , \tag{2.137a} $$

$$ \delta p_j = P_j - p_j = -\frac{\partial \sigma(q, p)}{\partial q_j}\, \varepsilon . \tag{2.137b} $$

This can be written in a symmetric form, using (2.130),

$$ \delta q_i = \{\sigma(q, p), q_i\}\, \varepsilon , \tag{2.138a} $$

$$ \delta p_j = \{\sigma(q, p), p_j\}\, \varepsilon . \tag{2.138b} $$

The equations (2.138) have the following interpretation. The infinitesimal canonical
transformation (2.134) shifts the generalized coordinate (momentum) proportionally
to $\varepsilon$ and to the Poisson bracket of the generating function (2.135) and that
coordinate (or momentum, respectively). A case of special interest is the following.
Let

$$ S(q, P, \varepsilon = dt) = S_E + H(q, p)\, dt . \tag{2.139} $$

With $dt$ replacing the parameter $\varepsilon$, (2.137), or (2.138), are nothing but the canonical
equations (taking $\delta q_i \equiv dq_i$, $\delta p_k \equiv dp_k$):

$$ dq_i = \{H, q_i\}\, dt , \qquad dp_j = \{H, p_j\}\, dt . \tag{2.140} $$

Thus, the Hamiltonian function serves to “boost” the system: $H$ is the generating
function for the infinitesimal canonical transformation that corresponds to the
actual motion $(dq_i, dp_j)$ of the system in the time interval $dt$.



2.34 Integrals of the Motion 



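We may wish to ask how a given dynamical quantity $f(q, p, t)$ behaves under an
infinitesimal transformation of the type (2.134). Formal calculation gives us the
answer:

$$ \delta_\sigma f(q, p) = \sum_{k=1}^{f} \left( \frac{\partial f}{\partial q_k}\, \delta q_k + \frac{\partial f}{\partial p_k}\, \delta p_k \right)
 = \sum_{k=1}^{f} \left( \frac{\partial f}{\partial q_k} \frac{\partial \sigma}{\partial p_k} - \frac{\partial f}{\partial p_k} \frac{\partial \sigma}{\partial q_k} \right) \varepsilon
 = \{\sigma, f\}\, \varepsilon . \tag{2.141} $$

For example, choosing $\varepsilon = dt$, $\sigma = H$, (2.141) yields with $\partial f/\partial t = 0$

$$ \frac{df}{dt} = \{H, f\} , \tag{2.142} $$

i.e. we recover (2.128) for the time change of $f$. In turn, we may ask how the
Hamiltonian function $H$ behaves under an infinitesimal canonical transformation
generated by the function $f(q, p)$. The answer is given in (2.141), viz.

$$ \delta_f H = \{f, H\}\, \varepsilon . \tag{2.143} $$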




In particular, the vanishing of the bracket $\{f, H\}$ means that $H$ stays invariant
under this transformation. If this is indeed the case, then, with $\{f, H\} = -\{H, f\}$
and (2.142), we conclude that $f$ is an integral of the motion. To work out this
reciprocity more clearly we write (2.142) in the notation

$$ \delta_H f = \{H, f\}\, dt \tag{2.142'} $$

and compare it with (2.143). One sees that $\delta_f H$ vanishes if and only if $\delta_H f$
vanishes. The infinitesimal canonical transformation generated by $f(q, p)$ leaves the
Hamiltonian function invariant if and only if $f$ is constant along physical orbits.

We note the close analogy to Noether’s theorem for Lagrangian systems
(Sect. 2.19). We wish to illustrate this by two examples.

Example (i) Consider an $n$-particle system described by

$$ H = \sum_{i=1}^{n} \frac{p_i^2}{2m_i} + U(r_1, \ldots, r_n) \tag{2.144} $$



that is invariant under translations in the direction $\hat{a}$. Thus the canonical
transformation

$$ S(r_1, \ldots, r_n; p'_1, \ldots, p'_n) = \sum_{i=1}^{n} r_i \cdot p'_i + a \cdot \sum_{i=1}^{n} p'_i \tag{2.145} $$

leaves $H$ invariant, $a$ being a constant vector whose modulus is $a$ and which points
in the direction $\hat{a}$. (The unprimed variables $r_i$, $p_i$ are to be identified with the old
variables $q_k$, $p_k$, while the primed ones are to be identified with $Q_k$, $P_k$.) From
(2.136) we have

$$ Q_k = \frac{\partial S}{\partial P_k} , $$

which is $r'_i = r_i + a$ here, with $(k = 1, \ldots, f = 3n)$, $(i = 1, \ldots, n)$, and

$$ p_k = \frac{\partial S}{\partial q_k} , $$

which is $p_i = p'_i$.

In fact it is sufficient to choose the modulus of the translation vector $a$
infinitesimally small. The infinitesimal translation is then generated by

$$ \sigma = \left. \frac{\partial S}{\partial a} \right|_{a=0} = \hat{a} \cdot \sum_{i=1}^{n} p_i . $$

As $H$ is invariant, we have $\{\sigma, H\} = 0$ and, from (2.142), $d\sigma/dt = 0$. We conclude
that $\sigma$, the projection of total momentum onto the direction $\hat{a}$, is an integral of the
motion.
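Example (ii) Assume now that the same system (2.144) is invariant under
arbitrary rotations of the coordinate system. If we consider an infinitesimal rotation
characterized by $\varphi = \varepsilon \hat{\varphi}$, then, from (2.72),

$$ r'_i = [\mathbb{1} - (\varphi \cdot J)]\, r_i + O(\varepsilon^2) , \qquad p'_i = [\mathbb{1} - (\varphi \cdot J)]\, p_i + O(\varepsilon^2) \qquad (i = 1, \ldots, n) . $$

The generating function $S(r_i, p'_k)$ is given by

$$ S = \sum_{i=1}^{n} r_i \cdot p'_i - \sum_{i=1}^{n} p'_i \cdot (\varphi \cdot J)\, r_i . \tag{2.146} $$

The notation is as follows: $(\varphi \cdot J)$ is a shorthand for the $3 \times 3$ matrix

$$ (\varphi \cdot J)_{ab} = \varepsilon\, [\hat{\varphi}_1 (J_1)_{ab} + \hat{\varphi}_2 (J_2)_{ab} + \hat{\varphi}_3 (J_3)_{ab}] . $$

The second term on the right-hand side of (2.146) contains the scalar product of
the vectors $(\varphi \cdot J)\, r_i$ and $p'_i$. First we verify that the generating function (2.146)
does indeed describe a rotation. We have

$$ Q_k = \frac{\partial S}{\partial P_k} , \quad \text{and thus} \quad r'_i = r_i - (\varphi \cdot J)\, r_i , $$

and

$$ p_k = \frac{\partial S}{\partial q_k} , \quad \text{and thus} \quad p_i = p'_i - p'_i (\varphi \cdot J) . $$

$J_i$ being antisymmetric, the second equation becomes $p_i = [\mathbb{1} + (\varphi \cdot J)]\, p'_i$. If this
is multiplied by $[\mathbb{1} - (\varphi \cdot J)]$ from the left we obtain the correct transformation
rule $p'_i = [\mathbb{1} - (\varphi \cdot J)]\, p_i$ up to the terms of second order in $\varepsilon$ (which must be
omitted, for the sake of consistency). Equation (2.146) now yields

$$ \sigma = \left. \frac{\partial S}{\partial \varepsilon} \right|_{\varepsilon = 0} = -\sum_{i=1}^{n} p_i \cdot (\hat{\varphi} \cdot J)\, r_i . $$

From (2.69) and (2.72) this can be expressed as the cross product of $\hat{\varphi}$ and $r_i$,

$$ (\hat{\varphi} \cdot J)\, r_i = \hat{\varphi} \times r_i , $$

so that, finally,

$$ p_i \cdot (\hat{\varphi} \cdot J)\, r_i = p_i \cdot (\hat{\varphi} \times r_i) = \hat{\varphi} \cdot (r_i \times p_i) = \hat{\varphi} \cdot l^{(i)} . $$

The integral of the motion is seen to be the projection of total angular momentum
$l = \sum_{i=1}^{n} l^{(i)}$ onto the direction $\hat{\varphi}$. As $H$ was assumed to be invariant for all
directions, we conclude that the whole vector $l = (l_1, l_2, l_3)$ is conserved and that

$$ \{H, l_a\} = 0 , \qquad a = 1, 2, 3 . \tag{2.147} $$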
2.35 The Hamilton-Jacobi Differential Equation



As we saw in Sect. 2.23, the solution of the equations of motion of a canonical
system becomes elementary if we succeed in making all coordinates cyclic ones. A
special situation where this is obviously the case is met when $\tilde{H}$, the transformed
Hamiltonian function, is zero. The question then is whether one can find a
time-dependent canonical transformation by which $\tilde{H}$ vanishes, viz.

$$ \{q, p, H(q, p, t)\} \;\xrightarrow{\;S^*(q, P, t)\;}\; \left\{ Q, P, \tilde{H} = H + \frac{\partial S^*}{\partial t} = 0 \right\} . \tag{2.148} $$

Let us denote this special class of generating functions by $S^*(q, P, t)$. For $\tilde{H}$ to
vanish we obtain the requirement

$$ \tilde{H} = H\left( q_i, p_k = \frac{\partial S^*}{\partial q_k}, t \right) + \frac{\partial S^*}{\partial t} = 0 . \tag{2.149} $$



This equation is called the differential equation of Hamilton and Jacobi. It is
a partial differential equation, of first order in time, for the unknown function
$S^*(q, \alpha, t)$, where $\alpha = (\alpha_1, \ldots, \alpha_f)$ are constants. Indeed, as $\tilde{H} = 0$, we have
$\dot{P}_k = 0$, so that the new momenta are constants, $P_k = \alpha_k$. Therefore, $S^*$ is a
function of the $(f + 1)$ variables $(q_1, \ldots, q_f, t)$ and of the (constant) parameters
$(\alpha_1, \ldots, \alpha_f)$. $S^*(q, \alpha, t)$ is called the action function.

The new coordinates $Q_k$ are also constants. They are given by

$$ Q_k = \frac{\partial S^*(q, \alpha, t)}{\partial \alpha_k} = \beta_k . \tag{2.150} $$

Equation (2.150) can be solved for

$$ q_k = q_k(\alpha, \beta, t) \tag{2.151} $$

precisely if

$$ \det\left( \frac{\partial^2 S^*}{\partial \alpha_k \partial q_l} \right) \neq 0 . \tag{2.152} $$

If the function $S^*(q, \alpha, t)$ fulfills this condition it is said to be a complete solution
of (2.149). In this sense, the partial differential equation (2.149) is equivalent to
the system (2.45) of canonical equations. Equation (2.149) is an important topic
in the theory of partial differential equations; its detailed discussion is beyond the
scope of this book.

If the Hamiltonian function does not depend explicitly on time, $H(q, p)$ is
constant along solution curves and is equal to the energy $E$. It is then sufficient
to study the function

$$ S(q, \alpha) \overset{\text{def}}{=} S^*(q, \alpha, t) + Et , \tag{2.153} $$

called the reduced action. (With $\partial S^*/\partial t = -H = -E$ from (2.149), the term $+Et$
is what makes $S$ independent of time.) It obeys a time-independent partial
differential equation that follows from (2.149), viz.

$$ H\left( q_i, \frac{\partial S}{\partial q_k} \right) = E . \tag{2.154} $$

This is known as the characteristic equation of Hamilton and Jacobi.



2.36 Examples for the Use of the Hamilton-Jacobi Equation



Example (i) Consider the motion of a free particle for which $H = p^2/2m$. The
Hamilton-Jacobi differential equation now reads

$$ \frac{1}{2m} \left( \nabla_r S^*(r, \alpha, t) \right)^2 + \frac{\partial S^*}{\partial t} = 0 . $$

Its solution is easy to guess. It is

$$ S^*(r, \alpha, t) = \alpha \cdot r - \frac{\alpha^2}{2m}\, t + c . $$

From (2.150) we obtain

$$ \beta = \nabla_\alpha S^* = r - \frac{\alpha}{m}\, t , $$

which is the expected solution $r(t) = \beta + \alpha t/m$. $\beta$ and $\alpha$ are integration
constants; they are seen to represent the initial position and momentum, respectively.
The solution as obtained from (2.149) reveals an interesting property. Let
$r(t) = (r_1(t), r_2(t), r_3(t))$. Then

$$ \dot{r}_i = \frac{1}{m} \frac{\partial S^*}{\partial r_i} , \qquad i = 1, 2, 3 . $$

This means that the trajectories $r(t)$ of the particle are everywhere perpendicular
to the surfaces $S^*(r, \alpha, t) = \text{const}$. The relation between these surfaces and the
particle’s trajectories (which are orthogonal to them) receives a new interpretation
in quantum mechanics. A quantum particle does not follow a classical trajectory. It
is described by waves whose wave fronts are the analog of the surfaces $S^* = \text{const}$.
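
As a short check of this example (a worked substitution added here for illustration; it uses only the function $S^*$ given in the example), one can insert the guessed solution back into the Hamilton-Jacobi equation and into (2.150):

```latex
% Substituting S^* = \alpha \cdot r - (\alpha^2/2m)\, t + c:
\nabla_r S^* = \alpha , \qquad
\frac{\partial S^*}{\partial t} = -\frac{\alpha^2}{2m}
\quad\Longrightarrow\quad
\frac{1}{2m} (\nabla_r S^*)^2 + \frac{\partial S^*}{\partial t}
  = \frac{\alpha^2}{2m} - \frac{\alpha^2}{2m} = 0 ,
\\[4pt]
\beta = \nabla_\alpha S^* = r - \frac{\alpha}{m}\, t
\quad\Longrightarrow\quad
r(t) = \beta + \frac{\alpha}{m}\, t , \qquad
m \dot{r} = \alpha = \nabla_r S^* .
```

The last relation is the statement that the velocity points along the gradient of $S^*$, i.e. perpendicular to the surfaces $S^* = \text{const}$.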



Example (ii) Consider the case of the Hamiltonian function $H = p^2/2m + U(r)$.
In this case we turn directly to the reduced action function (2.153) for which
(2.154) reads

$$ \frac{1}{2m} (\nabla S)^2 + U(r) = E . \tag{2.155} $$

As $E = p^2/2m + U(r)$ is constant along solutions, this equation reduces to

$$ (\nabla S)^2 = p^2 . $$

Its general solution can be written as an integral

$$ S = \int_{r_0}^{r} (p \cdot dr) + S_0 , \tag{2.156} $$

provided the integral is taken along the trajectory with energy $E$.



Remarks: 

(i) The generating function $S^*(q, \alpha, t)$ is closely related to the action integral
(2.27). We assume that the Hamiltonian function $H$ is such that the Legendre
transformation between $H$ and the Lagrangian function $L$ exists. Taking the time
derivative of $S^*$ and making use of (2.149), we find

$$ \frac{dS^*}{dt} = \frac{\partial S^*}{\partial t} + \sum_i \frac{\partial S^*}{\partial q_i}\,\dot q_i
   = \Big[ -H(q, p, t) + \sum_i p_i \dot q_i \Big]_{p = \partial S^*/\partial q} . $$



As the variables $p$ can be eliminated (by the assumption we made), the right-hand
side of this equation can be read as the Lagrangian function $L(q, \dot q, t)$. Integrating
over time from $t_0$ to $t$, we have

$$ S^*(q(t), \alpha, t) = \int_{t_0}^{t} dt'\, L(q, \dot q, t') . \tag{2.157} $$

In contrast to the general action integral (2.27), where $q$ and $\dot q$ are independent,
we must insert the solution curves into $L$ in the integrand on the right-hand side
of (2.157). Thus, the action integral, if it is taken for the physical solutions $q(t)$,
is the generating function for canonical transformations that “boost” the system
from time $t_0$ to time $t$.

(ii) Consider the integral on the right-hand side of (2.157), taken between $t_1$ and
$t_2$, and evaluated for the physical trajectory $\varphi(t)$ which goes through the boundary
values $a$ at time $t_1$, and $b$ at time $t_2$. This function is called Hamilton’s principal
function. We assume that the Lagrangian does not depend explicitly on time. The
principal function, which we denote by $I_0$, then depends on the time difference
$\tau \equiv t_2 - t_1$ only,

$$ I_0 \equiv I_0(a, b, \tau) = \int_{t_1}^{t_2} dt\, L(\varphi(t), \dot\varphi(t)) . $$

It is instructive to compare this function with the action integral (2.27): In (2.27)
$I[q]$ is a smooth functional, $q$ being an arbitrary smooth function of time that
connects the boundary values given there. In contrast to this, $I_0$ is calculated from
a solution $\varphi(t)$ of the equations of motion which goes through the boundary values
$(t_1, a)$ and $(t_2, b)$ and, hence, is a smooth function of $a$, $b$, and $(t_2 - t_1)$.

We consider now a smooth change of the initial and final values of the (generalized)
coordinates and of the running time $\tau$. This means that we replace the
solution $\varphi(t)$ by another solution $\varphi(s, t)$ which meets the following conditions.
$\varphi(s, t)$ is differentiable in the parameter $s$. For $s \to 0$ it goes over into the original
solution, $\varphi(s = 0, t) = \varphi(t)$. During the time $\tau' = \tau + \delta\tau$ it runs from
$a' = a + \delta a$ to $b' = b + \delta b$. How does $I_0$ respond to these smooth changes? The
answer is worked out in Exercise 2.30 and is as follows.

Let $p^a$ and $p^b$ denote the values of the momenta canonically conjugate to $q$,
at times $t_1$ and $t_2$ respectively. One finds

$$ \frac{\partial I_0}{\partial \tau} = -E , \qquad
   \frac{\partial I_0}{\partial a_i} = -p_i^a , \qquad
   \frac{\partial I_0}{\partial b_k} = p_k^b , $$

or, written as a variation,

$$ \delta I_0 = -E\,\delta\tau - \sum_{i=1}^{f} p_i^a\, \delta a_i + \sum_{k=1}^{f} p_k^b\, \delta b_k . $$



The function $I_0$ and these results can be used to determine the nature of the
extremum (2.27): maximum, minimum, or saddle point. For that purpose we consider
a set of neighboring physical trajectories which all go through the same initial
position $a$ but differ in their initial momenta $p^a$. We follow each one of these
trajectories over a fixed time $\tau = t_2 - t_1$ and compare the final positions as functions
of the initial momenta, $b(p^a)$, that is we determine the partial derivatives
$M_{ik} = \partial b_i/\partial p_k^a$. The inverse of this matrix $M$ is the matrix of mixed second
partial derivatives of $I_0$ with respect to $a$ and $b$,

$$ (M^{-1})_{ik} = \frac{\partial p_i^a}{\partial b_k} = -\frac{\partial^2 I_0}{\partial a_i\, \partial b_k} . $$

In general, we expect $M$ to have maximal rank. Then its inverse exists and $I_0$ is
a minimum or maximum. However, for certain values of the running time $\tau$, it
may happen that one or several of the $b_i$ remain unchanged by variations of the
initial momenta. In this case the matrix $M$ has rank smaller than maximal. Such
final positions $b$ for which $M$ becomes singular, together with the corresponding
initial positions $a$, are called conjugate points. If, in computing $I_0$, we happened
to choose conjugate points for the boundary values, $I_0$ is no longer a minimum
(or maximum).

A simple example is provided by force-free motion on $S_R^2$, the surface of the
sphere with radius $R$ in $\mathbb{R}^3$. Obviously, the physical orbits are the great circles
through the initial position $a$. If $b$ is not the antipode of $a$, then there is a longest
and a shortest arc of great circle joining $a$ and $b$. If, however, $a$ and $b$ are antipodes
then all trajectories starting from $a$ with momenta $p^a$ which have the same absolute
value but different directions, reach $b$ all at the same time. The point $b$ is conjugate
to $a$, and $I_0$ is a saddle point of the action integral (2.27).

As a second example, let us study the one-dimensional harmonic oscillator.
We use the reduced variables defined in Sect. 1.17.1. With $(a, p^a)$ and $(b, p^b)$
denoting the boundary values in phase space, $\tau$ the running time from $a$ to $b$, the
corresponding solution of the equations of motion reads

$$ \varphi(t) = \frac{1}{\sin\tau}\left[ a \sin(t_2 - t) + b \sin(t - t_1) \right] . $$

This trajectory is periodic. In the units used here the period is $T = 2\pi$. The
boundary values of the momentum are

$$ p^a = \dot\varphi(t_1) = \frac{-a\cos\tau + b}{\sin\tau} , \qquad
   p^b = \dot\varphi(t_2) = \frac{-a + b\cos\tau}{\sin\tau} , $$

while the energy is given by

$$ E = \frac{1}{2}\left[ \dot\varphi^2(t) + \varphi^2(t) \right] = \frac{a^2 + b^2 - 2ab\cos\tau}{2\sin^2\tau} . $$

In a similar fashion one calculates the Lagrangian $L = \frac{1}{2}\left[\dot\varphi^2(t) - \varphi^2(t)\right]$ as well
as the function $I_0$ along the given trajectory

$$ I_0(a, b, \tau) = \int_{t_1}^{t_2} dt\, L(\varphi(t), \dot\varphi(t)) = \frac{(a^2 + b^2)\cos\tau - 2ab}{2\sin\tau} . $$

One confirms that, indeed, $\partial I_0/\partial\tau = -E$, $\partial I_0/\partial a = -p^a$, $\partial I_0/\partial b = p^b$. The
matrix $M$, which in this example is one-dimensional, and its inverse are seen to
be

$$ M = \frac{\partial b}{\partial p^a} = \sin\tau , \qquad
   M^{-1} = -\frac{\partial^2 I_0}{\partial b\, \partial a} = \frac{1}{\sin\tau} . $$

$M^{-1}$ becomes singular at $\tau = \pi$ and at $\tau = 2\pi$, i.e. after half a period $T/2$ and
after one full period $T$, respectively.

Keeping the initial position $a$ fixed, but varying the initial momentum $p^a$, the
final position is given by $b(p^a, \tau) = p^a \sin\tau + a\cos\tau$. Expressed in terms of $a$,
$p^a$ and $\tau$ the integral $I_0$ becomes

$$ I_0(a, p^a, \tau) = \frac{1}{2}\sin\tau \left[ \left((p^a)^2 - a^2\right)\cos\tau - 2 a p^a \sin\tau \right] . $$

It is instructive to plot $b(p^a, \tau)$ as a function of the running time $\tau$, for different
values of the initial momentum $p^a$. As long as $0 \le \tau < \pi$ these curves do not
intersect (except for the point $a$). When $\tau = \pi$ they all meet in $b(p^a, \pi) = -a$,
independently of $p^a$. At this point $M^{-1}$ becomes singular. Thus, the points $a$ and
$-a$ are conjugate points. As long as $\tau$ stays smaller than $\pi$, the action integral $I$ is
a minimum. For $\tau = \pi$ all trajectories with given initial position $a$, but different
initial momenta $p^a$, go through the point $b = -a$, as required by Hamilton’s
principle. As $I_0(a, p^a, \tau = \pi)$ is always zero, the extremum of $I$ is a saddle
point.
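
The relations quoted for the oscillator lend themselves to a quick numerical check. The sketch below is illustrative only: the function names and the finite-difference helper `derivative` are assumptions introduced here, and the formulas are simply the closed-form expressions for $I_0$, $E$, $p^a$, $p^b$ and $b(p^a, \tau)$ in the reduced units of this example.

```python
import math


def principal_function(a, b, tau):
    """I_0(a, b, tau) for the oscillator in reduced units."""
    return ((a * a + b * b) * math.cos(tau) - 2.0 * a * b) / (2.0 * math.sin(tau))


def energy(a, b, tau):
    """Energy of the trajectory that joins a and b in running time tau."""
    return (a * a + b * b - 2.0 * a * b * math.cos(tau)) / (2.0 * math.sin(tau) ** 2)


def momenta(a, b, tau):
    """Boundary momenta (p^a, p^b)."""
    s = math.sin(tau)
    return (-a * math.cos(tau) + b) / s, (-a + b * math.cos(tau)) / s


def final_position(a, p_a, tau):
    """b(p^a, tau) for fixed initial position a."""
    return p_a * math.sin(tau) + a * math.cos(tau)


def principal_function_from_momentum(a, p_a, tau):
    """I_0 expressed through the initial data (a, p^a)."""
    return 0.5 * math.sin(tau) * ((p_a * p_a - a * a) * math.cos(tau)
                                  - 2.0 * a * p_a * math.sin(tau))


def derivative(f, x, h=1e-6):
    """Central finite difference, used only to test the partial derivatives."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

Differentiating `principal_function` with respect to `tau`, `a` and `b` should reproduce $-E$, $-p^a$ and $p^b$, while `final_position(a, p_a, math.pi)` should return $-a$ for every choice of `p_a`, which is the numerical signature of the conjugate point.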



2.37 The Hamilton-Jacobi Equation and Integrable Systems 

There are several general methods of solving the Hamilton-Jacobi differential 
equation (2.149) for situations of practical interest (see e.g. Goldstein 1984). In- 
stead of going into these, we address the general question of the existence of local, 
or even global, solutions of the canonical equations. We shall discuss the class of 
completely integrable Hamiltonian systems and give a few examples. The general 
definition of angle and action variables is then followed by a short description of 
perturbation theory for quasiperiodic Hamiltonian systems, which is of relevance 
for celestial mechanics. 

2.37.1 Local Rectification of Hamiltonian Systems 

Locally the Hamilton-Jacobi equation (2.149) possesses complete solutions, i.e. in
a neighborhood of an arbitrary point $x_0 = (q_0, p_0)$ of phase space one can always
find a canonical transformation whose generating function $S^*(q, \alpha, t)$ obeys the
condition (2.152), $\det(\partial^2 S^*/\partial q_i\, \partial\alpha_k) \neq 0$, and which transforms the Hamiltonian
function to $\tilde H = 0$. This follows, for example, from the explicit solution (2.157)
or a generalization thereof,

$$ S^*(q(t), \alpha, t) = S_0^*(q_0) + \int_{(q_0, t_0)}^{(q, t)} dt'\, L(q, \dot q, t') . \tag{2.158} $$

Here $S_0^*(q_0)$ is a function that represents a given initial condition for $S^*$ such that
$p_0 = \partial S_0^*(q)/\partial q\,|_{q_0}$. In the second term we have to insert the physical solution that
connects $(q_0, t_0)$ with $(q, t)$ and is obtained from the Euler-Lagrange equations
(2.28). Finally, $t$ and $t_0$ must be close enough to each other that physical orbits
$q(t)$, which, at $t = t_0$, pass in a neighborhood of $q_0$, do not intersect. (Note that we
talk here about intersection of the graphs $(t, q(t))$.) This is the reason the existence
of complete solutions is guaranteed only locally. Of course, this is no more than a
statement of the existence of solutions for the equations of motion: it says nothing
about their construction in practice. To find explicit solutions it may be equally
difficult to solve the equations of motion (i.e. either the Euler-Lagrange equations
or the canonical equations) or to find complete solutions of the Hamilton-Jacobi
differential equation. However, without knowing the solutions explicitly, one can
derive fairly general, interesting properties for the case of autonomous systems. We
consider an autonomous Hamiltonian system, defined by the Hamiltonian function
$H(q, p)$. $H$ is chosen such that the condition $\det(\partial^2 H/\partial p_i\, \partial p_k) \neq 0$ is fulfilled,
i.e. such that the Legendre transformation exists and is bijective. At first we note
that instead of (2.153) we can choose the more general form



$$ S^*(q, \alpha, t) = S(q, \alpha) - \Sigma(\alpha)\, t , \tag{2.159} $$

where $\Sigma(\alpha)$ is an arbitrary differentiable function of the new momenta (which are
conserved). Equation (2.154) is then replaced by

$$ H\left(q, \frac{\partial S}{\partial q}\right) = \Sigma(\alpha) . \tag{2.160} $$

As we transform to the new coordinates $(Q, P = \alpha)$, with all $Q_j$ cyclic, (2.160)
means that

$$ \tilde H(\alpha) \equiv H(q(Q, \alpha), \alpha) = \Sigma(\alpha) . \tag{2.160'} $$

For example, we could choose $\Sigma(\alpha) = \alpha_f = E$, thus returning to (2.154), with the
prescription that $P_f = \alpha_f$ be equal to the energy $E$. Without restriction of generality
we assume that, locally, the derivative $\partial H/\partial p_f$ is not zero (otherwise one must
reorder the phase-space variables). The equation $H(q_1 \ldots q_f, p_1 \ldots p_f) = E$ can
then be solved locally for $p_f$, viz.

$$ p_f = -h(q_1 \ldots q_{f-1}, q_f, p_1 \ldots p_{f-1}, E) . $$

Taking $q_f$ to be a formal time variable, $\tau \equiv q_f$, the function $h$ can be understood to
be the Hamiltonian function of a time-dependent system that has $(f - 1)$ degrees of
freedom and depends on the constant $E$. Indeed, one can show that the following
canonical equations of motion hold true:

$$ \frac{dq_i}{d\tau} = \frac{\partial h}{\partial p_i} , \qquad
   \frac{dp_i}{d\tau} = -\frac{\partial h}{\partial q_i} , \qquad \text{for } i = 1, 2, \ldots, f - 1 . $$

To see this, take the derivative of the equation

$$ H\big(q_1 \ldots q_{f-1}, \tau;\; p_1 \ldots p_{f-1}, -h(q_1 \ldots q_{f-1}, \tau;\, p_1 \ldots p_{f-1}, E)\big) = E $$

with respect to $p_i$, with $i = 1, 2, \ldots, f - 1$,

$$ \frac{\partial H}{\partial p_i} + \frac{\partial H}{\partial p_f}\frac{\partial p_f}{\partial p_i} = 0 . $$

However, $\partial H/\partial p_i = \dot q_i$, $\partial H/\partial p_f = \dot q_f$, and $\partial p_f/\partial p_i = -\partial h/\partial p_i$, and hence
$dq_i/d\tau = \dot q_i/\dot q_f = \partial h/\partial p_i$. In a similar fashion, taking the derivative with respect
to $q_i$, one obtains

$$ \frac{\partial H}{\partial q_i} + \frac{\partial H}{\partial p_f}\frac{\partial p_f}{\partial q_i} = 0 , $$

from which the second canonical equation is obtained, with $h$ the Hamiltonian
function. The Hamilton-Jacobi differential equation for this formally time-dependent
system

$$ \frac{\partial S^*}{\partial\tau} + h\left(q_1 \ldots q_{f-1}, \tau;\; \frac{\partial S^*}{\partial q_1}, \ldots, \frac{\partial S^*}{\partial q_{f-1}}, \Sigma\right) = 0 $$

locally always possesses a complete integral $S^*(q_1 \ldots q_{f-1}, \alpha_1 \ldots \alpha_{f-1}, \Sigma, \tau)$. $S^*$
being a complete solution means that

$$ \det\left(\frac{\partial^2 S^*}{\partial q_j\, \partial\alpha_i}\right) \neq 0 \qquad (i, j = 1, 2, \ldots, f - 1) . $$

Assuming that $\Sigma(\alpha)$ in (2.160) depends explicitly on $\alpha_f$, one can show that the
above condition is fulfilled also for $i$ and $j$ running through 1 to $f$ (hint: take the
derivative of (2.160) by $\alpha_f$). This then proves the following rectification theorem
for autonomous Hamiltonian systems.



Rectification Theorem. Let $(q, p)$ be a point of phase space where not all
of the derivatives $\partial H/\partial q_i$ and $\partial H/\partial p_j$ vanish,

$$ \left(\frac{\partial H}{\partial q_i}, \frac{\partial H}{\partial p_j}\right) \neq (0, 0) . $$

(In other words, this point should not be an equilibrium position of the
system.) Then the reduced equation (2.160) locally has a complete integral
$S(q, \alpha)$, i.e. condition (2.152) is fulfilled.



The new coordinates $Q_i$ are cyclic and are given by $Q_i = \partial S^*/\partial\alpha_i$. Their time
derivatives follow from (2.160') and the canonical equations. They are

$$ \dot Q_i = \frac{\partial \tilde H}{\partial\alpha_i} = \frac{\partial\Sigma(\alpha)}{\partial\alpha_i} . $$

With the special choice $\Sigma(\alpha) = \alpha_f = E$, for instance, we obtain

$$ \dot Q_i = 0 \ \text{ for } i = 1, 2, \ldots, f - 1 , \qquad
   \dot Q_f = 1 , \qquad
   \dot P_k = 0 \ \text{ for } k = 1, 2, \ldots, f , \tag{2.161} $$

and therefore

$$ Q_i \equiv \beta_i = \text{const.} , \ i = 1, 2, \ldots, f - 1 ; \qquad
   Q_f = t - t_0 = \frac{\partial S^*(q, \alpha)}{\partial\alpha_f} ; \qquad
   P_k = \alpha_k = \text{const.} , \ k = 1, 2, \ldots, f . $$

The significance of this theorem is the following: the flow of an autonomous system
can be rectified as shown in Fig. 2.16, in the neighborhood of every point of
phase space that is not an equilibrium position. Viewed locally, a transformation of
phase space variables smoothes the flux to a uniform, rectilinear flow, (e.g.) parallel
to the $Q_f$-axis. Outside their equilibrium positions all autonomous Hamiltonian
systems are locally equivalent.⁸ Therefore, interesting properties specific to a given
Hamiltonian (or more general) dynamical system concern the global structure of
its flow and its equilibrium positions. We shall return to these questions in Chap. 6.




Fig. 2.16. Locally and outside of an equilibrium position a dynamical system can
be rectified



⁸ This is a special case of the more general rectification theorem for general, autonomous,
differentiable systems: in the neighborhood of any point $x_0$ that is not an equilibrium position (i.e.
where $F(x_0) \neq 0$), the system $\dot x = F(x)$ of first-order differential equations can be transformed
to the form $\dot z = (1, 0, \ldots, 0)$, i.e. $\dot z_1 = 1$, $\dot z_2 = 0 = \ldots = \dot z_f$. For a proof see e.g. Arnol’d
(1973).

Example. The harmonic oscillator in one dimension. We shall study the harmonic
oscillator using the reduced variables defined in Sect. 1.17.1. For the sake of clarity
we write $q$ instead of $z_1$, $p$ instead of $z_2$, and $t$ instead of $\tau$. (Thus $q \equiv z_1$ and
$p \equiv z_2$ carry the dimension (energy)$^{1/2}$, while $t$ is measured in units of $\omega^{-1}$.) In
these units $H = (p^2 + q^2)/2$. Choosing the function on the right-hand side of
(2.159) as follows: $\Sigma(\alpha) = P > 0$, the corresponding Hamilton-Jacobi equation
(2.160) reads

$$ \frac{1}{2}\left(\frac{\partial S}{\partial q}\right)^2 + \frac{1}{2} q^2 = P . $$

Its integration is straightforward. Because $\partial S/\partial q = \sqrt{2P - q^2}$,

$$ S(q, P) = \int_0^q \sqrt{2P - q'^2}\, dq' \qquad \text{with } |q| < \sqrt{2P} . $$

We have

$$ \frac{\partial^2 S}{\partial q\, \partial P} = \frac{1}{\sqrt{2P - q^2}} \neq 0 $$

and

$$ p = \frac{\partial S}{\partial q} = \sqrt{2P - q^2} , \qquad
   Q = \frac{\partial S}{\partial P} = \int_0^q \frac{1}{\sqrt{2P - q'^2}}\, dq' = \arcsin\frac{q}{\sqrt{2P}} . $$

Because of the arcsin function, $Q$ should be restricted to the interval $(-\frac{\pi}{2}, \frac{\pi}{2})$.
However, solving for $q$ and $p$ one obtains

$$ q = \sqrt{2P}\, \sin Q , \qquad p = \sqrt{2P}\, \cos Q , $$

so that this restriction can be dropped. It is easy to confirm that the transformation
$(q, p) \mapsto (Q, P)$ is canonical, e.g. by verifying that $M = \partial(q, p)/\partial(Q, P)$ is symplectic,
or else that $(P\, dQ - p\, dq)$ is a total differential given by $-d(P \sin Q \cos Q)$.
Of course, the result is already known to us from Example (ii) of Sect. 2.24. In
the present case the rectification is even a global one, cf. Fig. 2.17. With units as
chosen here, the phase point runs along circles with radius $\sqrt{2P}$ in the $(q, p)$-plane,
with angular velocity 1. In the $(Q, P)$-plane the same point moves with
uniform velocity 1 along a straight line parallel to the $Q$-axis. As the frequency
is independent of the amplitude, the velocities on all phase orbits are the same in
either representation (this is typical for the harmonic oscillator).
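
The canonical character of this transformation can also be tested numerically. The following sketch is an illustration under stated assumptions: the helper names are hypothetical, and the test uses the fact that for a single degree of freedom a $2 \times 2$ Jacobian matrix is symplectic exactly when its determinant equals 1. The derivatives are approximated by central differences.

```python
import math


def to_qp(Q, P):
    """Map the rectifying variables (Q, P) back to (q, p)."""
    radius = math.sqrt(2.0 * P)
    return radius * math.sin(Q), radius * math.cos(Q)


def hamiltonian(q, p):
    """Oscillator Hamiltonian in reduced units, H = (p^2 + q^2)/2."""
    return 0.5 * (p * p + q * q)


def jacobian_determinant(Q, P, h=1e-6):
    """det d(q, p)/d(Q, P), estimated with central differences."""
    q_Qp, p_Qp = to_qp(Q + h, P)
    q_Qm, p_Qm = to_qp(Q - h, P)
    q_Pp, p_Pp = to_qp(Q, P + h)
    q_Pm, p_Pm = to_qp(Q, P - h)
    dq_dQ = (q_Qp - q_Qm) / (2.0 * h)
    dp_dQ = (p_Qp - p_Qm) / (2.0 * h)
    dq_dP = (q_Pp - q_Pm) / (2.0 * h)
    dp_dP = (p_Pp - p_Pm) / (2.0 * h)
    return dq_dQ * dp_dP - dq_dP * dp_dQ
```

For any admissible `(Q, P)` the determinant should come out as 1 and `hamiltonian(*to_qp(Q, P))` should equal `P`. Advancing `Q` by `t` at fixed `P` reproduces the familiar rotation of the phase point in the $(q, p)$-plane.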




Fig. 2.17. For an oscillator the rectification is global



2.37.2 Integrable Systems 

Mechanical systems that can be integrated completely and globally are the excep- 
tion in the many varieties of dynamical systems. In this section we wish to collect 
a few general properties and propositions and to give some examples of integrable 
systems. 

The chances of finding complete solutions for a given system, loosely speaking,
are the greater the more integrals of the motion are known.

Example (i) Motion of a particle in one dimension, under the influence of a potential
$U(q)$ (see Sect. 1.16). The system has one degree of freedom, $f = 1$, the
dimension of phase space is $\dim \mathbb{P} = 2$, and there is one integral of the motion:
that of the energy.

Example (ii) Motion of a particle in three dimensions, with a central potential
$U(r)$ (see Sect. 1.24). Here $f = 3$ and $\dim \mathbb{P} = 6$. Integrals of the motion are
provided by the energy $E$, the three components $l_i$ of angular momentum, and, as
a consequence, the square of angular momentum $\mathbf{l}^2$.

Generally, the dynamical quantities $g_2(q, p), \ldots, g_m(q, p)$ are integrals of the
motion if the Poisson brackets of the Hamiltonian function $H$ and $g_i$ vanish,
$\{H, g_i\} = 0$, for $i = 2, 3, \ldots, m$. Each one of these functions $g_i(q, p)$ may
serve as the generating function for an infinitesimal canonical transformation (cf.
Sect. 2.33). By the reciprocity discussed in Sect. 2.33, $H$ is left invariant by this
transformation. The question remains, however, in which way the other integrals of
the motion transform under the infinitesimal transformation generated by a specific
$g_i$. In Example (ii) above, $l_3$ generates an infinitesimal rotation about the 3-axis,
and we have

$$ \{l_3, H\} = 0 , \quad \{l_3, \mathbf{l}^2\} = 0 , \quad \{l_3, l_1\} = -l_2 , \quad \{l_3, l_2\} = l_1 . $$

In other words, while the values of the energy $E$ and the modulus of the angular
momentum $l = \sqrt{\mathbf{l}^2}$ are invariant, the rotation about the 3-axis changes the values
of $l_1$ and $l_2$. A solution with fixed values of $(E, \mathbf{l}^2, l_3, l_1, l_2)$ becomes a solution
with the values $(E, \mathbf{l}^2, l_3, l_1' = l_1 - \varepsilon l_2, l_2' = l_2 + \varepsilon l_1)$.
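
These bracket relations can be verified numerically. The sketch below is only an illustration: the helper names are hypothetical, and the sign convention $\{f, g\} = \sum_i (\partial f/\partial p_i\, \partial g/\partial q_i - \partial f/\partial q_i\, \partial g/\partial p_i)$ is an assumption, chosen because it reproduces the signs quoted above. Phase-space points are stored as `(q1, q2, q3, p1, p2, p3)`.

```python
def angular_momentum(x):
    """Components of l = q x p for x = (q1, q2, q3, p1, p2, p3)."""
    q1, q2, q3, p1, p2, p3 = x
    return (q2 * p3 - q3 * p2, q3 * p1 - q1 * p3, q1 * p2 - q2 * p1)


def partial(f, x, k, h=1e-5):
    """Central-difference partial derivative of f with respect to x[k]."""
    x_plus = list(x)
    x_minus = list(x)
    x_plus[k] += h
    x_minus[k] -= h
    return (f(x_plus) - f(x_minus)) / (2.0 * h)


def poisson_bracket(f, g, x):
    """{f, g} = sum_i (df/dp_i dg/dq_i - df/dq_i dg/dp_i); assumed convention."""
    n = len(x) // 2
    total = 0.0
    for i in range(n):
        total += partial(f, x, n + i) * partial(g, x, i)
        total -= partial(f, x, i) * partial(g, x, n + i)
    return total


def l1(x):
    return angular_momentum(x)[0]


def l2(x):
    return angular_momentum(x)[1]


def l3(x):
    return angular_momentum(x)[2]


def l_squared(x):
    return sum(c * c for c in angular_momentum(x))
```

At an arbitrary phase-space point, `poisson_bracket(l3, l1, x)` should equal `-l2(x)`, `poisson_bracket(l3, l2, x)` should equal `l1(x)`, and `poisson_bracket(l3, l_squared, x)` should vanish up to rounding.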

Thus, there are integrals of the motion that “commute” (i.e. whose Poisson
bracket $\{g_i, g_k\}$ vanishes), as well as others that do not. These two groups must be
distinguished because only the former is relevant for the question of integrability.
This leads us to the following.

Definition. The linearly independent dynamical quantities $g_1(q, p) \equiv H(q, p)$,
$g_2(q, p), \ldots, g_m(q, p)$ are said to be in involution if the Poisson bracket for any
pair of them vanishes,

$$ \{g_i(q, p), g_k(q, p)\} = 0 , \qquad i, k = 1, 2, \ldots, m . \tag{2.162} $$

In Example (ii) above, $H$, $\mathbf{l}^2$, and $l_3$ (or any other fixed component $l_j$) are in
involution. Let us consider a few more examples.

Example (iii) Among the ten integrals of the motion of the two-body system with
central force (cf. Sect. 1.12), the following six are in involution

$$ H_{\rm rel} = \frac{\mathbf{p}^2}{2\mu} + U(r) , \quad \mathbf{P} , \quad \mathbf{l}^2 , \quad l_3 , \tag{2.163} $$

$H_{\rm rel}$ being the energy of the relative motion, $\mathbf{P}$ the momentum of the center of
mass, and $\mathbf{l}$ the relative angular momentum.

Example (iv) (This anticipates Chap. 3). In the case of a force-free rigid body
(which has $f = 6$), the kinetic energy $H_{\rm rel} = \boldsymbol{\omega}\cdot\mathbf{L}/2$, the momentum of the
center of mass $\mathbf{P}$, and $\mathbf{L}^2$ and $L_3$ are in involution (cf. Sect. 3.13).

All quoted examples are globally integrable (in fact, they are integrable by
quadratures only). Their striking common feature is that the number of integrals
of the motion equals the number, $f$, of degrees of freedom. For instance, the two-body
problem of Example (iii) has $f = 6$ and possesses the six integrals (2.163)
in involution. If we consider the three-body system with central forces instead,
the number of degrees of freedom is $f = 9$, while the number of integrals of
the motion that are in involution remains the same as in the case of two bodies,
namely 6. Indeed, the three-body problem is not generally integrable.

Example (v) If, in turn, we manage to integrate a canonical system by means of
the Hamilton-Jacobi differential equation (2.149), we obtain the $f$ integrals of the
motion (2.150): $Q_k = \partial S^*(q, \alpha, t)/\partial\alpha_k$, $k = 1, 2, \ldots, f$, which trivially have the
property $\{Q_i, Q_k\} = 0$.

In conclusion, it seems as though the existence of $f$ independent integrals of
the motion is sufficient to render the system of $2f$ canonical equations integrable.
These matters are clarified by the following theorem of Liouville.

Theorem on Integrable Systems. Let $\{g_1 \equiv H, g_2, \ldots, g_f\}$ be dynamical quantities
defined on the $2f$-dimensional phase space $\mathbb{P}$ of an autonomous, canonical
system described by the Hamiltonian function $H$. The $g_i(x)$ are assumed to be in
involution,

$$ \{g_i, g_k\} = 0 , \qquad i, k = 1, \ldots, f , \tag{2.164} $$

and to be independent in the following sense: at each point of the hypersurface

$$ S = \{ x \in \mathbb{P} \mid g_i(x) = c_i ,\ i = 1, \ldots, f \} \tag{2.165} $$

the differentials $dg_1, \ldots, dg_f$ are linearly independent. Then:

(a) $S$ is a smooth hypersurface that stays invariant under the flow corresponding
to $H$. If, in addition, $S$ is compact and connected, then it can be mapped
diffeomorphically onto an $f$-dimensional torus

$$ T^f = S^1 \times \ldots \times S^1 \quad (f \text{ factors}) . \tag{2.165'} $$

(Here $S^1$ is the circle with radius 1, cf. also Sect. 5.2.3, Example (iii) below).

(b) Every $S^1$ can be described by means of an angle coordinate $\theta_i \in [0, 2\pi)$. The
most general motion on $S$ is a quasiperiodic motion, which is a solution of
the transformed equations of motion

$$ \frac{d\theta_i}{dt} = \omega^{(i)} , \qquad i = 1, \ldots, f . \tag{2.166} $$

(c) The canonical equations can be solved by quadratures (i.e. by ordinary integration).

The proof is clearest if one makes use of the elegant tools of Chap. 5. As the reader
is probably not yet familiar with them at this point, we skip the proof and refer to
Arnol’d (1988, Sect. 49) where it is given in quite some detail. A motion $\Phi$ in $\mathbb{P}$ is
said to be quasiperiodic, with base frequencies $\omega^{(1)}, \ldots, \omega^{(f)}$, if all components
of $\Phi(t, s, y)$ are periodic (the periods being $2\pi/\omega^{(i)}$) and if these frequencies are
rationally independent, i.e. with $r_i \in \mathbb{Z}$, we have

$$ \sum_{i=1}^{f} r_i\, \omega^{(i)} = 0 \quad \text{only if } r_1 = \ldots = r_f = 0 . \tag{2.167} $$

Let us study two more examples.

Let us study two more examples. 

Example (vi) Two coupled linear oscillators (cf. Practical Example 2.1). Here $f = 2$,
the Hamiltonian function being given by

$$ H = \frac{1}{2m}\left(p_1^2 + p_2^2\right) + \frac{1}{2} m\omega_0^2 \left(q_1^2 + q_2^2\right) + \frac{1}{2} m\omega_1^2 (q_1 - q_2)^2 . $$

The following are two integrals of the motion in involution related to $H$ by
$g_1 + g_2 = H$:

$$ g_1 = \frac{1}{4m}(p_1 + p_2)^2 + \frac{1}{4} m\omega_0^2 (q_1 + q_2)^2 , $$

$$ g_2 = \frac{1}{4m}(p_1 - p_2)^2 + \frac{1}{4} m\left(\omega_0^2 + 2\omega_1^2\right)(q_1 - q_2)^2 . $$

This decomposition of $H$ corresponds to the transformation to the two normal-mode
oscillations of the system, $z_{1/2} = (q_1 \pm q_2)/\sqrt{2}$, $g_1$ and $g_2$ being the energies
of these decoupled oscillations. Following Example (ii) of Sect. 2.24, we introduce
new canonical coordinates $\{Q_i = \theta_i, P_i = I_i\}$ such that

$$ H = g_1 + g_2 = \omega_0 I_1 + \sqrt{\omega_0^2 + 2\omega_1^2}\; I_2 . $$

Then $\theta_1 = \omega_0 t + \beta_1$, $\theta_2 = \sqrt{\omega_0^2 + 2\omega_1^2}\; t + \beta_2$. For fixed values of $I_1$ and $I_2$
the surface $S$ (2.165) is the torus $T^2$. If the two frequencies $\omega^{(1)} = \omega_0$, $\omega^{(2)} =
\sqrt{\omega_0^2 + 2\omega_1^2}$ are rationally dependent, i.e. if $n_1\omega^{(1)} = n_2\omega^{(2)}$ with $n_1$, $n_2$ positive
integers without common divisor, then the motion is periodic with period
$T = 2\pi n_2/\omega^{(1)} = 2\pi n_1/\omega^{(2)}$.
Any orbit on the torus $T^2$ closes. If, on the contrary, the frequencies are rationally
independent, the orbits never close. In this case any orbit is dense on the torus.
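
The decomposition $H = g_1 + g_2$ and the involution property of $g_1$ and $g_2$ can be checked numerically. This is a sketch under stated assumptions: the parameter values `M`, `W0`, `W1` are arbitrary illustrative choices, the helper names are hypothetical, and the bracket uses the convention $\{f, g\} = \sum_i (\partial f/\partial p_i\, \partial g/\partial q_i - \partial f/\partial q_i\, \partial g/\partial p_i)$, whose overall sign is irrelevant for a vanishing bracket. Phase-space points are stored as `(q1, q2, p1, p2)`.

```python
import math

M, W0, W1 = 1.0, 1.0, 0.5  # illustrative mass and frequencies (assumed values)


def hamiltonian(x):
    q1, q2, p1, p2 = x
    return ((p1 * p1 + p2 * p2) / (2.0 * M)
            + 0.5 * M * W0 ** 2 * (q1 * q1 + q2 * q2)
            + 0.5 * M * W1 ** 2 * (q1 - q2) ** 2)


def g1(x):
    """Energy of the symmetric normal mode."""
    q1, q2, p1, p2 = x
    return (p1 + p2) ** 2 / (4.0 * M) + 0.25 * M * W0 ** 2 * (q1 + q2) ** 2


def g2(x):
    """Energy of the antisymmetric normal mode."""
    q1, q2, p1, p2 = x
    return ((p1 - p2) ** 2 / (4.0 * M)
            + 0.25 * M * (W0 ** 2 + 2.0 * W1 ** 2) * (q1 - q2) ** 2)


def partial(f, x, k, h=1e-5):
    """Central-difference partial derivative of f with respect to x[k]."""
    x_plus = list(x)
    x_minus = list(x)
    x_plus[k] += h
    x_minus[k] -= h
    return (f(x_plus) - f(x_minus)) / (2.0 * h)


def poisson_bracket(f, g, x):
    """{f, g} for x = (q1, q2, p1, p2); assumed sign convention."""
    n = len(x) // 2
    total = 0.0
    for i in range(n):
        total += partial(f, x, n + i) * partial(g, x, i)
        total -= partial(f, x, i) * partial(g, x, n + i)
    return total


def normal_mode_frequencies(w0, w1):
    """Return (omega^(1), omega^(2)) of the two decoupled oscillations."""
    return w0, math.sqrt(w0 * w0 + 2.0 * w1 * w1)
```

Choosing, for instance, `w0 = 1` and `w1 = sqrt(1.5)` gives the frequency pair (1, 2), a rationally dependent case in which every orbit on the torus closes.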
Fig. 2.18. Coordinates used to describe the spherical pendulum



Example (vii) The spherical mathematical pendulum. Let $R$ be the length of the
pendulum, $\theta$ the deviation from the vertical, and $\varphi$ the azimuth in the horizontal
plane (see Fig. 2.18). We have

$$ H = \frac{p_\theta^2}{2mR^2} + \frac{p_\varphi^2}{2mR^2\sin^2\theta} + mgR(1 - \cos\theta) , $$

where $p_\theta = mR^2\, d\theta/dt$, $p_\varphi = mR^2\sin^2\theta\, d\varphi/dt$. The coordinate $\varphi$ is cyclic.
Hence, $p_\varphi \equiv l = \text{const}$. There are two integrals of the motion $g_1 = H$, $g_2 = p_\varphi$,
and we can verify that they are in involution, $\{g_1, g_2\} = 0$. Therefore, according
to the theorem above, the system is completely integrable by quadratures. Indeed,
taking $q_1 = \theta$, $q_2 = \varphi$, $\tau = \omega t$, $p_1 = dq_1/d\tau$, and $p_2 = \sin^2 q_1\, dq_2/d\tau$, and
introducing the parameters

$$ \varepsilon = \frac{E}{mgR} , \qquad \omega^2 = \frac{g}{R} , \qquad a^2 = \frac{l^2}{m^2 g R^3} , $$

we obtain

$$ \varepsilon = \frac{1}{2} p_1^2 + \frac{a^2}{2\sin^2 q_1} + (1 - \cos q_1) \equiv \frac{1}{2} p_1^2 + U(q_1) . $$

The equations of motion read

$$ \frac{dq_1}{d\tau} = p_1 = \pm\sqrt{2(\varepsilon - U(q_1))} , $$

$$ \frac{dp_1}{d\tau} = \frac{a^2\cos q_1}{\sin^3 q_1} - \sin q_1 , $$

$$ \frac{dq_2}{d\tau} = \frac{a}{\sin^2 q_1} . $$

They are completely integrable. From the first equation we obtain

$$ \tau = \int \frac{dq_1}{\sqrt{2(\varepsilon - U(q_1))}} . $$

Combining the first and the third yields

$$ q_2 = a \int \frac{dq_1}{\sin^2 q_1 \sqrt{2(\varepsilon - U(q_1))}} . $$
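
As a complement to the quadratures, the reduced equations of motion can be integrated directly and the conservation of $\varepsilon$ monitored. The sketch below is an illustration, not the method of the text: it assumes a classical fourth-order Runge-Kutta step, hypothetical function names, and arbitrary sample values for $a$ and the initial state `(q1, p1, q2)`.

```python
import math


def potential(q1, a):
    """Effective potential U(q1) = a^2/(2 sin^2 q1) + (1 - cos q1)."""
    return a * a / (2.0 * math.sin(q1) ** 2) + (1.0 - math.cos(q1))


def rhs(state, a):
    """Right-hand sides (dq1/dtau, dp1/dtau, dq2/dtau)."""
    q1, p1, _q2 = state
    s = math.sin(q1)
    return (p1, a * a * math.cos(q1) / s ** 3 - s, a / (s * s))


def rk4_step(state, a, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(x, k, factor):
        return tuple(xi + factor * ki for xi, ki in zip(x, k))

    k1 = rhs(state, a)
    k2 = rhs(shifted(state, k1, 0.5 * dt), a)
    k3 = rhs(shifted(state, k2, 0.5 * dt), a)
    k4 = rhs(shifted(state, k3, dt), a)
    return tuple(x + dt / 6.0 * (c1 + 2.0 * c2 + 2.0 * c3 + c4)
                 for x, c1, c2, c3, c4 in zip(state, k1, k2, k3, k4))


def reduced_energy(state, a):
    """epsilon = p1^2/2 + U(q1)."""
    q1, p1, _q2 = state
    return 0.5 * p1 * p1 + potential(q1, a)


def integrate(state, a, dt, steps):
    """Advance the state by `steps` Runge-Kutta steps."""
    for _ in range(steps):
        state = rk4_step(state, a, dt)
    return state
```

For $a \neq 0$ the centrifugal term keeps $q_1$ strictly between 0 and $\pi$, so the right-hand sides stay regular, and `reduced_energy` should remain constant along the numerical orbit up to the discretization error.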



2.37.3 Angle and Action Variables 



Suppose we are given an autonomous Hamiltonian system with (for the moment)
$f = 1$ that has periodic solutions for energies $E$ belonging to a certain interval
$[E_0, E_1]$. Let $\Gamma_E$ be a periodic orbit with energy $E$. Then the period $T(E)$ of the
orbit $\Gamma_E$ is equal to the derivative $dF(E)/dE$ of the surface $F(E)$ that is enclosed
by this orbit in phase space (see Exercises 2.1 and 2.27),

$$ T(E) = \frac{d}{dE}\oint_{\Gamma_E} p\, dq \equiv \frac{dF(E)}{dE} . $$

The period $T(E)$ being related to the circular frequency by $\omega(E) = 2\pi/T(E)$ we
define the quantity

$$ I(E) \stackrel{\rm def}{=} \frac{1}{2\pi} F(E) = \frac{1}{2\pi}\oint_{\Gamma_E} p\, dq . \tag{2.168} $$

$I(E)$ is called the action variable. Except for equilibrium positions, $T(E) =
2\pi\, dI(E)/dE$ is nonzero. Hence, the inverse function $E = E(I)$ exists. Therefore,
it is meaningful to construct a canonical transformation $\{q, p\} \to \{\theta, I\}$ such that
the transformed Hamiltonian function is just $E(I)$ and $I$ is the new momentum.
From (2.154) and (2.87) this means that

$$ p = \frac{\partial S(q, I)}{\partial q} , \qquad
   \theta = \frac{\partial S(q, I)}{\partial I} , \qquad
   H\left(q, \frac{\partial S}{\partial q}\right) = E(I) . \tag{2.169} $$

The new generalized coordinate $\theta$ is called the angle variable. We then have $I =
\text{const} \in \Delta$, where the interval $\Delta$ follows from the interval $[E_0, E_1]$ for $E$. The
equation of motion for $\theta$ takes the simple form

$$ \dot\theta = \frac{\partial E(I)}{\partial I} \equiv \omega(I) = \text{const} . $$

With the $(Q \equiv \theta, P \equiv I)$ description of phase space, the orbits lie in a strip
parallel to the $\theta$ axis, whose width is $\Delta$. Each periodic orbit has the representation
$(\theta = \omega(I)\, t + \theta_0, I = \text{const})$, i.e. in the new variables it runs parallel to the abscissa.
However, as $\theta$ is to be understood modulo $2\pi$, the phase space is bent to form part
of a cylinder with radius 1 and height $\Delta$. The periodic orbits lie on the manifold
$\Delta \times S^1$ in $\mathbb{P}$.
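
The definition of the action variable can be made concrete with a small numerical quadrature. The sketch below rests on assumptions that are not part of the text: a one-dimensional Hamiltonian of the form $H = p^2/2 + U(q)$ in reduced units, turning points supplied by the caller, and the substitution $q = c + r\sin\phi$ to tame the square-root behavior at the turning points. All function names are hypothetical. For the harmonic oscillator one expects $I(E) = E$ and $T = 2\pi$.

```python
import math


def action_variable(energy, potential, q_minus, q_plus, n=2000):
    """I(E) = (1/2pi) * closed integral of p dq for H = p^2/2 + U(q)."""
    center = 0.5 * (q_plus + q_minus)
    half_width = 0.5 * (q_plus - q_minus)
    d_phi = math.pi / n
    upper_branch = 0.0  # integral of p dq from q_minus to q_plus
    for j in range(n):
        phi = -0.5 * math.pi + (j + 0.5) * d_phi  # midpoint rule in phi
        q = center + half_width * math.sin(phi)
        p_sq = max(0.0, 2.0 * (energy - potential(q)))
        upper_branch += math.sqrt(p_sq) * half_width * math.cos(phi) * d_phi
    # The closed orbit consists of two mirror branches: F(E) = 2 * upper_branch.
    return upper_branch / math.pi


def oscillator_potential(q):
    return 0.5 * q * q


def oscillator_turning_points(energy):
    amplitude = math.sqrt(2.0 * energy)
    return -amplitude, amplitude


def period(energy, potential, turning_points, d_energy=1e-4):
    """T(E) = 2 pi dI/dE, with dI/dE from a central difference."""
    i_plus = action_variable(energy + d_energy, potential,
                             *turning_points(energy + d_energy))
    i_minus = action_variable(energy - d_energy, potential,
                              *turning_points(energy - d_energy))
    return 2.0 * math.pi * (i_plus - i_minus) / (2.0 * d_energy)
```

Other potentials can be treated by passing a different `potential` together with a matching `turning_points` function; the period then generally depends on the energy.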

For a system with more than one degree of freedom, $f > 1$, for which there are
$f$ integrals in involution, the angle variables are taken to be the angular coordinates
that describe the torus (2.165'). The corresponding action variables are

$$ I_k(c_1, \ldots, c_f) = \frac{1}{2\pi}\oint_{\Gamma_k} \sum_i p_i\, dq_i , $$

where one integrates over the curve $\Gamma_k$ in $\mathbb{P}$ that is the image of ($\theta_i = \text{const}$ for $i \neq k$,
$\theta_k \in S^1$). The manifold on which the motion takes place then has the form

$$ \Delta_1 \times \ldots \times \Delta_f \times (S^1)^f = \Delta_1 \times \ldots \times \Delta_f \times T^f . \tag{2.170} $$

Example (vi) illustrates the case $f = 2$ for two decoupled oscillators. In Example
(vii) the quantities $\varepsilon$ (energy) and $a$ (azimuthal angular momentum) are constants
of the motion, and we have

$$ I_1(\varepsilon, a) = \frac{1}{2\pi}\oint p_1\, dq_1 = \frac{1}{2\pi}\oint \sqrt{2(\varepsilon - U(q_1))}\; dq_1 , $$

$$ I_2(\varepsilon, a) = \frac{1}{2\pi}\oint p_2\, dq_2 = a . $$

Solving the first equation for $\varepsilon$, $\varepsilon = \varepsilon(I_1, a)$, we obtain the frequency for the
motion in $\theta$ (the deviation from the vertical),

$$ \dot\theta_1 = \frac{\partial\varepsilon(I_1, a)}{\partial I_1} \equiv \omega_1 . $$



2.38 Perturbing Quasiperiodic Hamiltonian Systems 

The theory of perturbations of integrable quasiperiodic Hamiltonian systems is 
obviously fundamental for celestial mechanics and for Hamiltonian dynamics in 
general. This is an important and extensive branch of mathematics that we cannot 
deal with in detail for lack of space. We can only sketch the basic questions ad- 
dressed in perturbation theory and must refer to the literature for a more adequate 
account. 

Consider an autonomous, integrable system for which there is a set of action-angle
variables. Let the system be described by $H_0(I)$. We now add to it a small
Hamiltonian perturbation so that the Hamiltonian function of the perturbed system
reads

$$ H(\theta, I, \mu) = H_0(I) + \mu H_1(\theta, I, \mu) . \tag{2.171} $$

Here $H_1$ is assumed to be $2\pi$-periodic in the angle variables $\theta$, while $\mu$ is a real
parameter that controls the strength of the perturbation.

To quote an example, let us consider the restricted three-body problem, which
is defined as follows. Two mass points $P_1$ and $P_2$ whose masses are $m_1$ and $m_2$,
respectively, move on circular orbits about their center of mass, under the action
of gravitation. A third mass point $P$ is added that moves in the orbit plane of
$P_1$ and $P_2$, and whose mass is negligible compared to $m_1$ and $m_2$ so that it does
not perturb the motion of the original two-body system. The problem consists in
finding the motion of $P$. Obviously, this is a model for the motion of the moon in
the field of the sun and the earth, of the motion of an asteroid with respect to the
system of the sun and Jupiter (the heaviest of the planets in our planetary system),
or of the motion of satellites in the neighborhood of the earth and the moon.

Thus, the general problem is defined as follows: 



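(a) $H(\theta, I, \mu)$ is a real analytic function of $\theta \in T^f$, of $I \in \Delta_1 \times \ldots \times \Delta_f$, as in
(2.170), and of $\mu \in \mathcal{I} \subset \mathbb{R}$, where the interval $\mathcal{I}$ includes the origin.

(b) $H$ is periodic in the variables $\theta_i$, i.e.

$$ H(\theta + 2\pi e_i, I, \mu) = H(\theta, I, \mu) , \qquad i = 1, 2, \ldots, f , $$

where $e_i$ is the $i$th unit vector.

(c) For $\mu = 0$ the problem has a form that is integrable directly and completely.
The condition $\det(\partial^2 H/\partial I_k\, \partial I_j) \neq 0$ holds. The unperturbed solutions read

$$ \theta_i^{(0)}(t) = \frac{\partial H_0}{\partial I_i}\, t + \beta_i^{(0)} , \qquad
   I_i^{(0)} = \alpha_i^{(0)} , \qquad i = 1, 2, \ldots, f ,
   \quad \text{with } \alpha_i^{(0)} \in \Delta_i . \tag{2.172} $$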



The aim of perturbation theory is to construct solutions of the perturbed system
for small values of $\mu$. We assumed $H$ to be real and analytic in $\mu$. Therefore, any
solution (2.172) can be continued in any finite time interval $I_t$, and for small values
of $\mu$ with, say, $|\mu| < \mu_0(I_t)$, where $\mu_0$ is suitably chosen and is a function of
the interval $I_t$. Unfortunately, the question that is of real physical interest is much
more difficult: it is the question whether there exist solutions of the perturbed
system that are defined for all times. Only if one succeeds in constructing such
solutions is there a chance to decide, for instance, whether the periodic motion of
our planetary system is stable at large time scales. In fact, this question still has
no final answer.⁹

⁹ There is evidence, from numerical studies, that the motion of the planet Pluto is chaotic, i.e. that
it is intrinsically unstable over large time scales (G.J. Sussman and J. Wisdom, Science 241 (1988)
433). Because Pluto couples to the other planets, though weakly, this irregular behavior eventually
spreads to the whole system.

Perturbation theory makes use of two basic ideas. The first is to do a systematic
expansion in terms of the parameter $\mu$ and to solve the equations generated in this
way, order by order. Let

$$ \theta_k = \theta_k^{(0)} + \mu\,\theta_k^{(1)} + \mu^2\theta_k^{(2)} + \ldots ; \qquad \theta_k^{(0)} = \omega_k t + \beta_k , $$
$$ I_k = I_k^{(0)} + \mu\, I_k^{(1)} + \mu^2 I_k^{(2)} + \ldots ; \qquad I_k^{(0)} = \alpha_k , \tag{2.173} $$

and then insert these expansions in the canonical equations,

$$ \dot\theta_k = \{H, \theta_k\} , \qquad \dot I_k = \{H, I_k\} , $$

and compare terms of the same order $\mu^n$. For instance, at first order $\mu^1$ one finds

$$ \dot\theta_k^{(1)} = \{H_1, \theta_k^{(0)}\} = \frac{\partial H_1(\theta^{(0)}, I^{(0)})}{\partial I_k^{(0)}} , \qquad
   \dot I_k^{(1)} = \{H_1, I_k^{(0)}\} = -\frac{\partial H_1(\theta^{(0)}, I^{(0)})}{\partial\theta_k^{(0)}} . \tag{2.174} $$



We have to insert the unperturbed solutions $\theta^{(0)}$ and $I^{(0)}$ on the right-hand side,
for consistency, because otherwise there would appear terms of higher order in $\mu$.
As $H_1$ was assumed to be periodic, it can be written as a Fourier series,

$$H_1(\theta^{(0)}, I^{(0)}) = \sum_{m_1 \ldots m_f} C_{m_1 \ldots m_f}(\alpha)\,\exp\Bigl\{ i \sum_{k=1}^{f} m_k\theta_k^{(0)} \Bigr\} = \sum_{m_1 \ldots m_f} C_{m_1 \ldots m_f}(\alpha)\,\exp\Bigl\{ i \sum_{k=1}^{f} m_k(\omega_k t + \beta_k) \Bigr\} .$$



Equations (2.174) can then be integrated. The solutions contain terms whose time
dependence is given by

$$\frac{1}{\sum_k m_k\omega_k}\,\exp\Bigl\{ i \sum_k m_k\omega_k\, t \Bigr\} .$$

Such terms will remain small, for small perturbations, unless their denominator
vanishes. If, in turn, $\sum_k m_k\omega_k = 0$, $\theta^{(1)}$ and $I^{(1)}$ will grow linearly in time. This
kind of perturbation is said to be a secular perturbation.

The simplest case is the one where the frequencies $\omega_k$ are rationally independent,
cf. (2.167). The time average of a continuous function $F$ over the quasiperiodic
flow $\theta^{(0)}(t) = \omega t + \beta$ is equal to the space average of $F$ on the torus $T^f$ $^{10}$,

$$\lim_{T\to\infty}\frac{1}{T}\int_0^T F(\theta(t))\,dt = \frac{1}{(2\pi)^f}\int_{T^f} d\theta_1 \ldots d\theta_f\, F(\theta) =: \langle F\rangle . \tag{2.175}$$
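The statement (2.175) can be checked numerically for a small case. The following is a minimal sketch, assuming $f = 2$, the frequency pair $(1, \sqrt{2})$ and a test function `F` chosen purely for illustration (none of these choices come from the text). It compares a finite-time average along the quasiperiodic flow with a grid average over the torus.

```python
import math

def time_average(F, omega, beta, T, n_steps):
    """Midpoint-rule estimate of (1/T) * integral_0^T F(theta(t)) dt,
    with theta_k(t) = omega_k * t + beta_k."""
    dt = T / n_steps
    total = 0.0
    for j in range(n_steps):
        t = (j + 0.5) * dt
        total += F([w * t + b for w, b in zip(omega, beta)])
    return total / n_steps

def torus_average(F, n_grid):
    """Average of F over the 2-torus, sampled on an n_grid x n_grid midpoint grid."""
    h = 2.0 * math.pi / n_grid
    total = 0.0
    for a in range(n_grid):
        for b in range(n_grid):
            total += F([(a + 0.5) * h, (b + 0.5) * h])
    return total / n_grid**2

def F(theta):
    # Hypothetical test function; its exact torus average is 1/2.
    return (math.cos(theta[0])**2 * (1.0 + math.sin(theta[1]))
            + math.cos(theta[0] - theta[1]))

omega = (1.0, math.sqrt(2.0))   # rationally independent pair (assumed)
beta = (0.3, 1.1)               # arbitrary initial phases (assumed)
t_avg = time_average(F, omega, beta, T=2000.0, n_steps=200000)
s_avg = torus_average(F, n_grid=64)
print(t_avg, s_avg)
```

For a finite averaging time $T$ the two numbers differ by a remainder that decays like $1/T$; the slowest term here is the one with the small combination frequency $1 - \sqrt{2}$.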



Taking account of the secular term alone, one then obtains from (2.174) the
approximate equations

$$\dot\theta_k^{(1)} = \frac{\partial}{\partial I_k^{(0)}}\langle H_1\rangle , \qquad \dot I_k^{(1)} = 0 . \tag{2.176}$$



The second idea is to transform the initial system (2.171) by means of successive
canonical transformations in such a way that the transformed Hamiltonian function
$H$ depends only on the action variables $I$, up to terms of increasingly high order in
the parameter $\mu$. This program requires a detailed discussion and needs advanced
and refined tools of analysis. Here we can only quote the main result, which is
relevant for questions of stability of Hamiltonian systems.

$^{10}$ Equation (2.175) holds for functions $f_k = \exp\{i\sum_j k_j\theta_j(t)\}$ with $\theta_j(t) = \omega_j t + \beta_j$, where it gives,
in fact, $\langle f_k\rangle = 0$, except for $k_1 = \ldots = k_f = 0$. Any continuous $F$ can be approximated by a
finite linear combination $F = \sum_k c_k f_k$.

2.39 Autonomous, Nondegenerate Hamiltonian Systems
in the Neighborhood of Integrable Systems

The manifold on which the motions of an autonomous integrable system $H_0(I)$ take
place is the one given in (2.170). We assume that the frequencies $\{\omega_i\}$ are rationally
independent (see (2.167)). For fixed values of the action variables $I_k = \alpha_k$ every
solution curve runs around the torus $T^f$ and covers it densely. One says that the
quasiperiodic motion is ergodic. After a sufficiently long time the orbit returns
to an arbitrarily small neighborhood of its starting point but does not close. This
situation is described by the term nonresonant torus$^{11}$.

We now add a small Hamiltonian perturbation to this system so that it is
described by

$$H(\theta, I, \mu) = H_0(I) + \mu H_1(\theta, I, \mu) . \tag{2.177}$$

The question then is in which sense this system is stable. Does the perturbation
modify only slightly the manifold of motions of the system $H_0(I)$, or does it
destroy it completely?

The most important result that to a large extent answers this question is pro- 
vided by a theorem of Kolmogorov, Arnol’d and Moser that we wish to state here 
without proof in admittedly somewhat qualitative terms. 



Theorem (KAM). If the frequencies of an integrable, Hamiltonian system
$H_0$ are rationally independent and if, in addition, these frequencies
are sufficiently irrational, then, for small values of $\mu$, the perturbed system
$H = H_0 + \mu H_1$ has solutions that are predominantly quasiperiodic, too, and
that differ only slightly from those of the unperturbed system $H_0$. Most of
the nonresonant tori of $H_0$ are deformed, but only slightly. Thus, the perturbed
system possesses nonresonant tori as well, on which the orbits are
dense.



Here, sufficiently irrational means the following. A single frequency is sufficiently
irrational if there are positive real numbers $\gamma$ and $\alpha$ such that

$$\Bigl|\omega - \frac{n}{m}\Bigr| \geq \gamma\, m^{-\alpha} \tag{2.178a}$$

for all integers $m$ and $n$. Similarly, $f$ rationally independent frequencies are
sufficiently irrational if there are positive constants $\gamma$ and $\alpha$ such that

$$\Bigl|\sum_{i=1}^{f} r_i\omega_i\Bigr| \geq \gamma\,|r|^{-\alpha} , \qquad r_i \in \mathbb{Z} . \tag{2.178b}$$
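A condition of the type (2.178a) can be probed numerically. The sketch below is illustrative only: it fixes the exponent at 2, scans denominators up to an arbitrary cutoff `m_max`, and compares the golden mean with a frequency placed very close to the rational 3/2. The function name and all numerical values are assumptions made for this example.

```python
import math

def min_scaled_distance(omega, alpha, m_max):
    """Smallest value of m**alpha * |omega - n/m| over 1 <= m <= m_max,
    where n is the integer nearest to m * omega."""
    best = float("inf")
    for m in range(1, m_max + 1):
        n = round(m * omega)
        best = min(best, m**alpha * abs(omega - n / m))
    return best

golden = (1.0 + math.sqrt(5.0)) / 2.0
gamma_golden = min_scaled_distance(golden, alpha=2, m_max=2000)

near_rational = 1.5 + 1e-9          # almost resonant with n/m = 3/2
gamma_near = min_scaled_distance(near_rational, alpha=2, m_max=2000)

print(gamma_golden, gamma_near)
```

Within the scanned range the golden mean stays at a distance of at least about $0.38\,m^{-2}$ from every rational $n/m$, whereas the nearly resonant frequency admits only a constant of order $10^{-9}$. A finite scan of this kind is of course no proof of the inequality for all $m$.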



It is instructive to study the special case of systems with two degrees of freedom,
$f = 2$, because they exhibit many interesting properties that can be analyzed in
detail (see e.g. Guckenheimer, Holmes 1986, Sect. 4.8). The general case is treated,
e.g., by Thirring (1989) and Rüssmann (1979).

$^{11}$ If, in turn, the frequencies are rationally dependent, the tori are said to be resonant tori, cf.
Example (vi) of Sect. 2.37.2. In this case the motion is quasiperiodic with a number of frequencies
that is smaller than $f$.

The KAM theorem was a decisive step forward in our understanding of the
dynamics of quasiperiodic, Hamiltonian systems. It yields good results on long-term
stability, although with certain, and somewhat restrictive, assumptions. Therefore,
the qualitative behavior of only a restricted class of systems can be derived from
it. An example is provided by the restricted three-body problem sketched above
(Rüssmann 1979). Unfortunately, our planetary system falls outside the range of
applicability of the theorem. Also, the theorem says nothing about what happens
when the frequencies $\{\omega_i\}$ are not rationally independent, i.e. when there are
resonances. We shall return to this question in Sect. 6.5.



2.40 Examples. The Averaging Principle 

2.40.1 The Anharmonic Oscillator 



Consider a perturbed oscillator in one dimension, the perturbation being
proportional to the fourth power of the coordinate. The Hamiltonian function is

$$H = \frac{p^2}{2m} + \frac{1}{2}m\omega_0^2q^2 + \mu q^4 \tag{2.179}$$

or, in the notation of (2.177),

$$H_0 = \frac{p^2}{2m} + \frac{1}{2}m\omega_0^2q^2 , \qquad H_1 = q^4 .$$

In the absence of the anharmonic perturbation, the energy $E^{(0)}$ of a periodic orbit
is related to the maximal amplitude $q_{\max}$ by $(q_{\max})^2 = 2E^{(0)}/(m\omega_0^2)$. We take

$$x \stackrel{\rm def}{=} \frac{q}{q_{\max}} \quad\text{and}\quad \varepsilon \stackrel{\rm def}{=} \mu\,\frac{4E^{(0)}}{m^2\omega_0^4} ,$$

so that the potential energy becomes

$$U(q) = \frac{1}{2}m\omega_0^2q^2 + \mu q^4 = E^{(0)}(1 + \varepsilon x^2)x^2 .$$

We study this system using two different approaches. 

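(i) If we want the perturbed oscillation to have the same maximal amplitude
$q_{\max}$, i.e. $x_{\max} = 1$, the energy must be chosen to be $E = E^{(0)}(1 + \varepsilon)$. The aim is
to compute the period of the perturbed solution to order $\varepsilon$. From (1.55) we have

$$T(\varepsilon) = \frac{2}{\omega_0}\int_{-1}^{+1} dx\,\bigl[(1 + \varepsilon) - (1 + \varepsilon x^2)x^2\bigr]^{-1/2} .$$

In the neighborhood of $\varepsilon = 0$ one has

$$T(\varepsilon = 0) = \frac{2}{\omega_0}\int_{-1}^{+1}\frac{dx}{\sqrt{1 - x^2}} = \frac{2\pi}{\omega_0} ,$$

$$\frac{dT}{d\varepsilon}\Big|_{\varepsilon=0} = -\frac{1}{\omega_0}\Bigl\{\int_{-1}^{+1}\frac{dx}{\sqrt{1 - x^2}} + \int_{-1}^{+1}\frac{x^2\,dx}{\sqrt{1 - x^2}}\Bigr\} = -\frac{3}{4}\,\frac{2\pi}{\omega_0} .$$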

Thus, the perturbed solution with the same maximal amplitude has the frequency
$\omega = \omega_0(1 + 3\varepsilon/4) + O(\varepsilon^2)$. It reads

$$q(t) \simeq q_{\max}\sin\bigl((1 + 3\varepsilon/4)\,\omega_0 t + \varphi_0\bigr) . \tag{2.180}$$

Comparing this with the unperturbed solution $q^{(0)}(t) = q_{\max}\sin(\omega_0 t + \varphi_0)$, we
see that $q(t)$ is in opposite phase to $q^{(0)}(t)$ after the time $\Delta = 4\pi/(3\varepsilon\omega_0)$. Thus,
with increasing time, the perturbed solution moves far away from the unperturbed
one.
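The first-order result for the period can be checked by direct quadrature. The following is a minimal sketch: the substitution $x = \sin u$ removes the endpoint singularity of the period integral at fixed amplitude $x_{\max} = 1$, and a central difference estimates the slope at $\varepsilon = 0$. The function name, the step count and the value of $\varepsilon$ are illustrative assumptions.

```python
import math

def period(eps, omega0=1.0, n=2000):
    """Period of the quartic oscillator at fixed amplitude x_max = 1.
    With x = sin(u) the integrand becomes 1/sqrt(1 + eps*(1 + sin(u)**2)),
    integrated over -pi/2 <= u <= pi/2 by the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for j in range(n):
        u = -0.5 * math.pi + (j + 0.5) * h
        total += 1.0 / math.sqrt(1.0 + eps * (1.0 + math.sin(u)**2))
    return 2.0 * total * h / omega0

T0 = period(0.0)                                   # unperturbed period, 2*pi/omega0
eps = 1e-3
slope = (period(eps) - period(-eps)) / (2.0 * eps)  # estimate of dT/d(eps) at eps = 0
first_order = T0 * (1.0 - 0.75 * eps)               # T0 * (1 - 3*eps/4)
print(T0, slope, period(eps), first_order)
```

The slope comes out close to $-(3/4)(2\pi/\omega_0)$, and the exact period at $\varepsilon = 10^{-3}$ differs from the first-order expression only at order $\varepsilon^2$.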

(ii) Let us analyze the same system but this time making use of the methods
of Sect. 2.38. The action variable (2.168) of the unperturbed oscillator is given by

$$I^{(0)} = \frac{1}{2\pi}\,2\int_{-q_{\max}}^{q_{\max}} dq\,\sqrt{2m\bigl(E^{(0)} - m\omega_0^2q^2/2\bigr)} = \frac{2E^{(0)}}{\pi\omega_0}\int_{-1}^{+1} dx\,\sqrt{1 - x^2} = \frac{E^{(0)}}{\omega_0} ,$$

and therefore we have $H_0(I^{(0)}) = E^{(0)} = I^{(0)}\omega_0$. The angle variable $\theta^{(0)}$ was
determined in the example of Sect. 2.37.1 (cf. also Sect. 2.24, Example (ii)). We
have

$$q^{(0)}(t) = q_{\max}\sin\theta^{(0)} \quad\text{with}\quad q_{\max} = \sqrt{\frac{2I^{(0)}}{m\omega_0}} \quad\text{and}\quad \theta^{(0)} = \omega_0 t + \varphi_0 .$$



Inserting this into the perturbation yields

$$H_1(\theta^{(0)}, I^{(0)}) = \frac{4I^{(0)2}}{m^2\omega_0^2}\,\sin^4\theta^{(0)} .$$

We now calculate the average of $\sin^4$ over the torus $T^1 = S^1$:

$$\int_0^{2\pi}\sin^4\theta^{(0)}\,d\theta^{(0)} = \frac{3}{8}\,2\pi = \frac{3\pi}{4} .$$

The average of $H_1$ (2.175) is then $\langle H_1\rangle = 3I^{(0)2}/(2m^2\omega_0^2)$. Inserting this into (2.176)
we get

$$\dot\theta^{(1)} = \frac{\partial}{\partial I^{(0)}}\langle H_1\rangle = \frac{3I^{(0)}}{m^2\omega_0^2} , \qquad \dot I^{(1)}(t) = 0 . \tag{2.181}$$

To first order in the parameter $\mu$, which measures the strength of the perturbation,
we obtain according to (2.173)

$$\frac{1}{t}\,\theta(t) \simeq \omega_0 + \frac{3\mu I^{(0)}}{m^2\omega_0^2} = \omega_0\Bigl(1 + \frac{3}{4}\varepsilon\Bigr) ,$$
$$I(t) \simeq I^{(0)} , \tag{2.182}$$

with $\varepsilon$ as defined above. Clearly (2.182) is precisely our earlier result (2.180): the
frequency increases a little, but the action variable stays constant.
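The chain from the torus average of $H_1$ to the frequency shift can be reproduced numerically. The sketch below is a hedged illustration: the parameter values are arbitrary, the helper names are hypothetical, and the derivative of $\langle H_1\rangle$ with respect to the action is taken by a central difference rather than analytically.

```python
import math

def circle_average(g, n=256):
    """Average of g(theta) over [0, 2*pi) on a uniform midpoint grid."""
    h = 2.0 * math.pi / n
    return sum(g((j + 0.5) * h) for j in range(n)) / n

def averaged_h1(action, m, omega0):
    """<H_1> for H_1 = q**4 with q = sqrt(2*action/(m*omega0)) * sin(theta)."""
    q_max = math.sqrt(2.0 * action / (m * omega0))
    return circle_average(lambda th: (q_max * math.sin(th))**4)

def shifted_frequency(action, m, omega0, mu, d_action=1e-6):
    """omega0 + mu * d<H_1>/dI, with the derivative taken by central difference."""
    dh = (averaged_h1(action + d_action, m, omega0)
          - averaged_h1(action - d_action, m, omega0)) / (2.0 * d_action)
    return omega0 + mu * dh

# Arbitrary illustrative parameters (assumptions, not from the text)
m, omega0, mu, energy0 = 1.3, 2.0, 0.01, 0.7
action = energy0 / omega0                        # I = E/omega0 for the unperturbed oscillator
eps = 4.0 * mu * energy0 / (m**2 * omega0**4)    # dimensionless perturbation strength
omega = shifted_frequency(action, m, omega0, mu)
sin4_mean = circle_average(lambda th: math.sin(th)**4)
print(sin4_mean, omega, omega0 * (1.0 + 0.75 * eps))
```

The mean of $\sin^4$ comes out as $3/8$, and the shifted frequency agrees with $\omega_0(1 + 3\varepsilon/4)$ to within rounding.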



2.40.2 Averaging of Perturbations 



The result (2.176) for the motion in first-order perturbation theory contains the
average of $H_1$ over the torus (2.175). This average is the same as the time average
(if the frequencies are rationally independent). This is a special case of a more
general situation that may be described as follows. For the sake of simplicity we
consider the case $f = 1$. The unperturbed system has the period $T_0 = 2\pi/\omega_0$.
Take $t$ to be a time large compared to $T_0$, but still small compared to $\Delta \simeq T_0/\mu$,
where $\mu$ again measures the strength of the perturbation. It is instructive to consider
the example of Sect. 2.40.1, where $\Delta = 4\pi/(3\varepsilon\omega_0)$ is the time after which the
perturbed system is completely out of phase. Taking, for example, the solution
(2.180) with $\varphi_0 = 0$, measuring $q$ in units of $q_{\max}$, and noting that
$(3\varepsilon/4)\,\omega_0 = \pi/\Delta$, we have

$$q(t) = \sin\Bigl(\frac{2\pi}{T_0}t\Bigr)\cos\Bigl(\frac{\pi}{\Delta}t\Bigr) + \cos\Bigl(\frac{2\pi}{T_0}t\Bigr)\sin\Bigl(\frac{\pi}{\Delta}t\Bigr) ,$$

which, for $T_0 \ll t \ll \Delta$, is approximately

$$q(t) \simeq \sin\Bigl(\frac{2\pi}{T_0}t\Bigr) + \frac{\pi}{\Delta}\,t\,\cos\Bigl(\frac{2\pi}{T_0}t\Bigr) .$$

Thus, the unperturbed solution is modified by a small term that is the product of a
term proportional to $t$ and of $\cos(2\pi t/T_0)$, the latter being of comparatively rapid
oscillation. During the same time the action variable does not change, or changes
only to second order in the perturbation.

More generally, if the equations $\dot\theta^{(0)} = \omega^{(0)}(I^{(0)})$, $\dot I^{(0)} = 0$ are subject to a
perturbation such that the perturbed equations of motion read

$$\dot\theta = \omega^{(0)}(I^{(0)}) + \mu f(\theta, I) ,$$
$$\dot I = \mu g(\theta, I) , \tag{2.183}$$

where $f$ and $g$ are periodic functions in $\theta$, then the change of the action variable
over time $t$ will be approximately

$$\delta I \simeq \mu t\,\Bigl\{\frac{1}{t}\int_0^t g\bigl(\theta^{(0)}(t'), I^{(0)}\bigr)\,dt'\Bigr\} .$$



As $t \gg T_0$, the term in curly brackets is approximately the time average, taken over
the unperturbed motion. Here, this is equal to the average over the torus $T^1$.
Therefore, one expects the average behavior of $I(t)$ to be described by the differential
equation

$$\dot I = \mu\langle g\rangle = \frac{\mu}{2\pi}\int_0^{2\pi} d\theta\, g(\theta, I) . \tag{2.184}$$

Returning to the special case of a Hamiltonian perturbation,

$$H = H_0(I) + \mu H_1(\theta, I) ,$$

the second equation (2.183) reads

$$\dot I = -\mu\,\frac{\partial H_1}{\partial\theta} .$$

As $H_1$ was assumed to be periodic, the average of $\partial H_1/\partial\theta$ over the torus vanishes
and we obtain the averaged equation (2.184),

$$\dot I = 0 ,$$

in agreement with the results of perturbation theory. These results tell us that the
action variable does not change over time intervals of the order of $t$, with $T_0 \ll t \ll
\Delta$. Dynamical quantities that have this property are said to be adiabatic invariants.
The characteristic time interval that enters the definition of such invariants is $t$, with
$t \ll \Delta \simeq T_0/\mu$. Therefore, it is meaningful to make the replacement $\mu = \eta t$ in the
perturbed system $H(\theta, I, \mu)$. For times $0 < t < 1/\eta$ the system changes slowly,
or adiabatically. A dynamical quantity $F(\theta, I, \mu): P \to \mathbb{R}$ is called adiabatic
invariant if for every positive constant $c$ there is an $\eta_0$ such that for $\eta < \eta_0$ and
$0 < t < 1/\eta$

$$|F(\theta(t), I(t), \eta t) - F(\theta(0), I(0), 0)| < c \tag{2.185}$$

(see e.g. Arnol'd 1978, 1983).

Note that the perturbation on the right-hand side of (2.183) need not be
Hamiltonian. Thus, we can also study more general dynamical systems of the form

$$\dot x = \mu f(x, t, \mu) , \tag{2.186}$$

where $x \in P$, $0 \leq \mu \ll 1$, and where $f$ is periodic with period $T_0$ in the time
variable $t$. Defining

$$\langle f\rangle(x) \stackrel{\rm def}{=} \frac{1}{T_0}\int_0^{T_0} f(x, t, 0)\,dt ,$$

we may decompose $f$ into its average and an oscillatory part,

$$f = \langle f\rangle(x) + g(x, t, \mu) .$$

Substituting

$$x = y + \mu S(y, t, \mu)$$

and taking the differential with respect to $t$ gives

$$\frac{dy_k}{dt} + \mu\sum_i\frac{\partial S_k}{\partial y_i}\frac{dy_i}{dt} = \frac{dx_k}{dt} - \mu\frac{\partial S_k}{\partial t} = \mu\langle f_k\rangle(x) + \mu g_k(x, t, \mu) - \mu\frac{\partial S_k}{\partial t} .$$

If $S$ is chosen such that $\partial S_k/\partial t = g_k(y, t, 0)$, and if terms of higher than first
order in $\mu$ are neglected, then (2.186) becomes the average, autonomous system

$$\dot y = \mu\langle f\rangle(y) \tag{2.187}$$



(see Guckenheimer, Holmes 1986, Sect. 4.1).

Let us return once more to the example of Sect. 2.40.1. In the first approach we had
asked for that solution of the perturbed system which had the same maximal amplitude
as the unperturbed one. Now we have learnt to “switch on” the perturbation, in a
time-dependent fashion, by letting $\mu = \eta t$ increase slowly (adiabatically) from $t = 0$
to $t$. Our result tells us that the adiabatically increasing perturbation deforms the
solution with energy $E^{(0)}$ and amplitude $q_{\max}$ smoothly into the perturbed solution
with energy $E = E^{(0)}(1 + \varepsilon)$ that has the same amplitude as the unperturbed one.
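The passage from (2.186) to the averaged system (2.187) can be illustrated with a one-dimensional toy model. In the sketch below the right-hand side $f(x, t) = x(1 - x)(1 - \cos 2t)$, the value of $\mu$, the initial condition and the integrator are all assumptions chosen for illustration. The time average of $f$ is $x(1 - x)$, and the full and averaged solutions are compared over a time of order $1/\mu$.

```python
import math

def rk4(rhs, x0, t_end, n_steps):
    """Classical fourth-order Runge-Kutta for a scalar ODE x' = rhs(x, t).
    Returns the list of (t, x) pairs including the initial point."""
    h = t_end / n_steps
    t, x = 0.0, x0
    out = [(t, x)]
    for _ in range(n_steps):
        k1 = rhs(x, t)
        k2 = rhs(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = rhs(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = rhs(x + h * k3, t + h)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
        out.append((t, x))
    return out

mu = 0.02                      # small parameter (assumed value)
t_end = 5.0 / mu               # a few multiples of 1/mu

def f(x, t):
    # Hypothetical perturbation, periodic in t with period pi
    return x * (1.0 - x) * (1.0 - math.cos(2.0 * t))

def f_mean(y):
    # Time average of f over one period
    return y * (1.0 - y)

full = rk4(lambda x, t: mu * f(x, t), 0.1, t_end, 20000)
averaged = rk4(lambda y, t: mu * f_mean(y), 0.1, t_end, 20000)
max_gap = max(abs(xf - ya) for (_, xf), (_, ya) in zip(full, averaged))
print(max_gap, mu)
```

In this example the two solutions stay within a distance of order $\mu$ of each other over the whole interval, which is the behavior the substitution $x = y + \mu S$ leads one to expect.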

A final word of caution is in order. The effects of small perturbations are by 
no means always smooth and adiabatic - in contrast to what the simple examples 
above seem to suggest. For example, if the time dependence of the perturbation 
is in resonance with one of the frequencies of the unperturbed system, then even 
a small perturbing term will upset the system dramatically. 



2.41 Generalized Theorem of Noether 

In the original form of Noether’s theorem, Sect. 2.19, the Lagrangian function
was assumed to be strictly invariant under continuous transformations containing
the identity. Invariance of $L(q_1, \ldots, q_f, \dot q_1, \ldots, \dot q_f)$ with respect to one-parameter
symmetry transformations of the variables $q_i$ implied that the equations of motion
were covariant, that is, were form invariant under such transformations. This is one
of the reasons why the notion of Lagrangian function is of central importance:
in many situations it is far simpler to construct invariants rather than covariant
differential equations. The theorem in its strict form was illustrated by the closed
$n$-particle system with central forces. It was shown that

- its invariance under space translations yielded conservation of total momentum,

- its invariance under rotations in $\mathbb{R}^3$ yielded conservation of total angular
momentum.

- By extending the definition of independent generalized coordinates it was also
possible to demonstrate the relationship between invariance under translations
in time and conservation of energy, see Exercise 2.17 and its solution.

Throughout this section, for simplicity, we drop the “under-tilde” on points of
velocity space or phase space and write $q$ for $(q^1, \ldots, q^f)$, $\dot q$ for $(\dot q^1, \ldots, \dot q^f)$,
and $p$ for $(p_1, \ldots, p_f)$.

In this section we discuss further versions of Noether’s theorem which
generalize the previous case in two respects. First we recall that covariance of the
equations of motion is also guaranteed if the Lagrangian function is not strictly
invariant but is modified by an additive time differential of a function of $q$ and $t$,

$$L(q, \dot q, t) \mapsto L'(q, \dot q, t) = L(q, \dot q, t) + \frac{d}{dt}M(t, q) . \tag{2.188}$$

The function $M$ should be a smooth function (or, at least, a $C^2$ function) of the
coordinates $q_i$ and possibly time but otherwise is arbitrary. As an example consider
the Lagrangian function of $n$ freely moving particles,

$$L(x_1, \ldots, x_n, \dot x_1, \ldots, \dot x_n) = T_{\rm kin} = \frac{1}{2}\sum_{i=1}^{n} m_i\dot x_i^2 .$$



Although $L$ is not strictly invariant under arbitrary Galilei transformations, the
corresponding Euler-Lagrange equations are covariant, i.e. if one of the two following
equations holds, then so does the other:

$$\frac{d^2x_i(t)}{dt^2} = 0 \quad\Longleftrightarrow\quad \frac{d^2x'_i(t')}{dt'^2} = 0 , \qquad i = 1, 2, \ldots, n .$$



Thus, a general Galilei transformation (1.32) (barring time reversal and space
reflections)

$$t \mapsto t' = t + s , \qquad s \in \mathbb{R} , \tag{2.189a}$$
$$x \mapsto x' = \mathbf{R}x + wt + a , \qquad \mathbf{R} \in SO(3) , \quad w, a \ \text{real} , \tag{2.189b}$$

does not change the form of the equations of motion. However, it does change the
Lagrangian function, viz.

$$L'(x'_i, \dot x'_i) = L(x_i, \dot x_i) + \sum_{i=1}^{n} m_i(\mathbf{R}\dot x_i)\cdot w + \frac{1}{2}\sum_{i=1}^{n} m_i w^2 = L(x_i, \dot x_i) + \sum_{i=1}^{n} m_i\dot x_i\cdot(\mathbf{R}^{-1}w) + \frac{1}{2}\sum_{i=1}^{n} m_i w^2 .$$

The new function $L'$ differs from $L$ by the time differential of the function

$$M(x_1, \ldots, x_n, t) = \sum_{i=1}^{n} m_i x_i\cdot(\mathbf{R}^{-1}w) + \frac{t}{2}\Bigl(\sum_{i=1}^{n} m_i\Bigr)w^2 .$$

This is seen to be a gauge transformation in the sense of Sect. 2.10 which leaves
the equations of motion unchanged. The example shows that Noether’s theorem
can be extended to cases where the Lagrangian function is modified by the gauge
terms introduced in Sect. 2.10.

A further generalization of the theorem consists in admitting gauge functions
in (2.188) which depend on the variables $t$, $q^i$, as well as on $\dot q^i$, provided a
supplementary condition is introduced which guarantees that any new acceleration terms
$\ddot q^i$ caused by the symmetry transformation vanish identically$^{12}$.

Given a mechanical system with $f$ degrees of freedom, to which one can
associate a Lagrangian function $L(q, \dot q, t)$ and coordinates $(t, q^1, \ldots, q^f, \dot q^1, \ldots, \dot q^f)$
on $\mathbb{R}_t \times P$ (direct product of time axis and phase space), consider transformations
of the coordinates

$$t' = g(t, q, \dot q, s) , \tag{2.190a}$$
$$q'^i = h^i(t, q, \dot q, s) . \tag{2.190b}$$

The functions $g$ and $h^i$ should be (at least) twice differentiable in their $2f + 2$
arguments. The real parameter $s$ varies within an interval that includes zero, and
for $s = 0$ (2.190a) and (2.190b) are the identity transformations

$$g(t, q, \dot q, s = 0) = t , \qquad h^i(t, q, \dot q, s = 0) = q^i , \qquad i = 1, 2, \ldots, f .$$



As for the case of strict invariance of the Lagrangian function, only the
neighbourhood of $s = 0$ matters for our purposes. This means that $g$ and $h^i$ may be expanded
up to first order in $s$,

$$\delta t := t' - t = \frac{\partial g}{\partial s}\Big|_{s=0}\,s + O(s^2) = \tau(t, q, \dot q)\,s + O(s^2) , \tag{2.191a}$$
$$\delta q^i := q'^i - q^i = \frac{\partial h^i}{\partial s}\Big|_{s=0}\,s + O(s^2) = \kappa^i(t, q, \dot q)\,s + O(s^2) , \tag{2.191b}$$

terms of order $s^2$ and higher being neglected. The first derivatives defined in
(2.191a) and in (2.191b),

$$\tau(t, q, \dot q) = \frac{\partial g(t, q, \dot q, s)}{\partial s}\Big|_{s=0} , \qquad \kappa^i(t, q, \dot q) = \frac{\partial h^i(t, q, \dot q, s)}{\partial s}\Big|_{s=0} ,$$

are the generators for infinitesimal transformations $g$ and $h^i$.

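An arbitrary smooth curve $t \mapsto q(t)$ is mapped by $g$ and $h^i$ to a curve $t' \mapsto
q'(t')$. To first order in $s$, their time derivatives fulfill the relation

$$\frac{dq'^i}{dt'} = \frac{dq'^i}{dt}\,\frac{dt}{dt'} = \frac{\dot q^i + s\dot\kappa^i}{1 + s\dot\tau} = \dot q^i + s(\dot\kappa^i - \dot q^i\dot\tau) + O(s^2) . \tag{2.192}$$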



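$^{12}$ W. Sarlet, F. Cantrijn; SIAM Review 23 (1981) 467.

The action functional on which Hamilton’s principle rests stays invariant, up to
gauge terms, if there is a function $M(t, q, \dot q)$ such that

$$\int_{t'_1}^{t'_2} dt'\,L\Bigl(q'(t'), \frac{d}{dt'}q'(t'), t'\Bigr) = \int_{t_1}^{t_2} dt\,L\Bigl(q(t), \frac{d}{dt}q(t), t\Bigr) + s\int_{t_1}^{t_2} dt\,\frac{dM(t, q, \dot q)}{dt} + O(s^2)$$

for every smooth curve $t \mapsto q(t)$. The integral on the left-hand side can be
transformed to an integral over $t$ from $t_1$ to $t_2$,

$$\int_{t'_1}^{t'_2} dt'\,L\Bigl(q'(t'), \frac{d}{dt'}q'(t'), t'\Bigr) = \int_{t_1}^{t_2} dt\,\frac{dt'}{dt}\,L\Bigl(q'(t'), \frac{d}{dt'}q'(t'), t'\Bigr) .$$

Then, for every smooth curve we must have

$$L\Bigl(q'(t'), \frac{d}{dt'}q'(t'), t'\Bigr)\frac{dt'}{dt} = L\bigl(q(t), \dot q(t), t\bigr) + s\,\frac{dM(t, q, \dot q)}{dt} + O(s^2) , \tag{2.193a}$$

this being an identity in the variables $t$, $q$, and $\dot q$. To first order in $s$ and with
$dt'/dt = 1 + s\dot\tau$ this equation yields

$$\frac{\partial L}{\partial t}\,\delta t + \sum_i\frac{\partial L}{\partial q^i}\,\delta q^i + \sum_i\frac{\partial L}{\partial\dot q^i}\,\delta\dot q^i + s\,L\,\dot\tau = s\,\frac{dM(t, q, \dot q)}{dt} . \tag{2.193b}$$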



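What we have to do next is to insert (2.192) and to calculate the total time
derivatives $\dot\tau$, $\dot\kappa^i$, and $\dot M(t, q, \dot q)$, obtaining

$$\delta\dot q^i = s(\dot\kappa^i - \dot q^i\dot\tau) ,$$
$$\dot\tau = \frac{\partial\tau}{\partial t} + \sum_j\frac{\partial\tau}{\partial q^j}\dot q^j + \sum_j\frac{\partial\tau}{\partial\dot q^j}\ddot q^j ,$$
$$\dot\kappa^i = \frac{\partial\kappa^i}{\partial t} + \sum_j\frac{\partial\kappa^i}{\partial q^j}\dot q^j + \sum_j\frac{\partial\kappa^i}{\partial\dot q^j}\ddot q^j ,$$
$$\dot M = \frac{\partial M}{\partial t} + \sum_j\frac{\partial M}{\partial q^j}\dot q^j + \sum_j\frac{\partial M}{\partial\dot q^j}\ddot q^j .$$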


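Collecting all terms in (2.193b) that are proportional to $s$ and inserting the auxiliary
formulae just given, one obtains the somewhat lengthy expression

$$\frac{\partial L}{\partial t}\tau + \sum_i\frac{\partial L}{\partial q^i}\kappa^i + \sum_i\frac{\partial L}{\partial\dot q^i}\Bigl(\frac{\partial\kappa^i}{\partial t} + \sum_j\frac{\partial\kappa^i}{\partial q^j}\dot q^j + \sum_j\frac{\partial\kappa^i}{\partial\dot q^j}\ddot q^j\Bigr) - \sum_i\frac{\partial L}{\partial\dot q^i}\dot q^i\Bigl(\frac{\partial\tau}{\partial t} + \sum_j\frac{\partial\tau}{\partial q^j}\dot q^j + \sum_j\frac{\partial\tau}{\partial\dot q^j}\ddot q^j\Bigr)$$
$$\qquad + L\Bigl(\frac{\partial\tau}{\partial t} + \sum_j\frac{\partial\tau}{\partial q^j}\dot q^j + \sum_j\frac{\partial\tau}{\partial\dot q^j}\ddot q^j\Bigr) - \Bigl(\frac{\partial M}{\partial t} + \sum_j\frac{\partial M}{\partial q^j}\dot q^j + \sum_j\frac{\partial M}{\partial\dot q^j}\ddot q^j\Bigr) = 0 . \tag{2.193c}$$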



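At this point one imposes the condition on the terms containing accelerations $\ddot q^j$
that was formulated above. This yields a first set of $f$ equations. Indeed, collecting
all such terms for every value of the index $j$, one obtains

$$L(t, q, \dot q)\,\frac{\partial\tau}{\partial\dot q^j} + \sum_i\frac{\partial L}{\partial\dot q^i}\Bigl(\frac{\partial\kappa^i}{\partial\dot q^j} - \dot q^i\frac{\partial\tau}{\partial\dot q^j}\Bigr) = \frac{\partial M}{\partial\dot q^j} . \tag{2.194a}$$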



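If these equations are fulfilled the lengthy equation (2.193c) reduces to one further
equation that will be important for identifying integrals of the motion. It reads

$$\frac{\partial L}{\partial t}\tau + \sum_i\frac{\partial L}{\partial q^i}\kappa^i + \sum_i\frac{\partial L}{\partial\dot q^i}\Bigl(\frac{\partial\kappa^i}{\partial t} + \sum_j\frac{\partial\kappa^i}{\partial q^j}\dot q^j\Bigr) + \Bigl(L - \sum_i\frac{\partial L}{\partial\dot q^i}\dot q^i\Bigr)\Bigl(\frac{\partial\tau}{\partial t} + \sum_j\frac{\partial\tau}{\partial q^j}\dot q^j\Bigr) = \frac{\partial M}{\partial t} + \sum_j\frac{\partial M}{\partial q^j}\dot q^j . \tag{2.194b}$$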



Thus, one obtains in total $(f + 1)$ equations which hold for arbitrary smooth curves
$t \mapsto q(t)$. These equations simplify when $q(t) = \varphi(t)$ is a solution of the
Euler-Lagrange equations for $L$,

$$\frac{d}{dt}\frac{\partial L}{\partial\dot q^i} - \frac{\partial L}{\partial q^i} = 0 , \qquad i = 1, 2, \ldots, f , \qquad q(t) = \varphi(t) .$$



The strategy aiming at uncovering integrals of the motion is the following:
Write eq. (2.193c), as far as possible, as a sum of terms which contain only total
time derivatives, and make use of the equations of motion to replace, wherever
this is necessary, $\partial L/\partial q^i$ by $\frac{d}{dt}(\partial L/\partial\dot q^i)$. Note that those expressions in (2.193c) that are
contained in round brackets are already in the form of total time derivatives. Only
the first two terms on the left-hand side still contain partial derivatives. Repeating
(2.193c), inserting the equations of motion in the second term, it becomes

$$\frac{\partial L}{\partial t}\tau + \sum_i\Bigl(\frac{d}{dt}\frac{\partial L}{\partial\dot q^i}\Bigr)\kappa^i + \sum_i\frac{\partial L}{\partial\dot q^i}\frac{d\kappa^i}{dt} - \sum_i\frac{\partial L}{\partial\dot q^i}\dot q^i\frac{d\tau}{dt} + L\frac{d\tau}{dt} - \frac{dM}{dt} = 0 . \tag{2.195a}$$



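The sum of the second and third terms of this equation is a total differential.
Collecting the first and the fourth terms, and making use once more of the equations
of motion, one has

$$\frac{\partial L}{\partial t}\tau - \sum_i\frac{\partial L}{\partial\dot q^i}\dot q^i\frac{d\tau}{dt} = \frac{dL}{dt}\tau - \sum_i\Bigl(\frac{d}{dt}\frac{\partial L}{\partial\dot q^i}\Bigr)\dot q^i\tau - \sum_i\frac{\partial L}{\partial\dot q^i}\Bigl(\frac{d}{dt}\dot q^i\Bigr)\tau - \sum_i\frac{\partial L}{\partial\dot q^i}\dot q^i\frac{d\tau}{dt}$$
$$\qquad = \frac{dL}{dt}\tau - \frac{d}{dt}\sum_i\Bigl(\frac{\partial L}{\partial\dot q^i}\dot q^i\tau\Bigr) . \tag{2.195b}$$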



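Inserting this in (2.195a) transforms this equation into one that contains indeed
only total derivatives with respect to $t$,

$$\frac{d}{dt}(L\tau) + \frac{d}{dt}\sum_i\frac{\partial L}{\partial\dot q^i}(\kappa^i - \dot q^i\tau) - \frac{d}{dt}M = 0 . \tag{2.195c}$$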



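This shows that the dynamical quantity $I : \mathbb{R}_t \times P \to \mathbb{R}$,

$$I(t, q, \dot q) = L(t, q, \dot q)\,\tau(t, q, \dot q) + \sum_i\frac{\partial L}{\partial\dot q^i}\bigl[\kappa^i(t, q, \dot q) - \dot q^i\tau(t, q, \dot q)\bigr] - M(t, q, \dot q) \tag{2.196}$$

is constant when taken along solutions $q(t) = \varphi(t)$ of the equations of motion.
We call $I$, eq. (2.196), the Noether invariant.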

The following examples serve the purpose of illustrating the nature of the 
Noether invariant and its relationship to the symmetries of the mechanical sys- 
tem described by the Lagrangian function. 

Example (i) If the generating function $\tau$ as well as the gauge function $M$ vanish
identically,

$$\tau(t, q, \dot q) = 0 , \qquad M(t, q, \dot q) = 0 , \tag{2.197}$$

then one is back to the case of strict invariance, Sect. 2.19. The invariant (2.196)
then is identical with the expression (2.56) for which several examples were given
in Sect. 2.19.

Example (ii) Consider the closed $n$-particle system described by the Lagrangian
function

$$L = \frac{1}{2}\sum_{k=1}^{n} m_k\dot x^{(k)2} - U(x^{(1)}, \ldots, x^{(n)}) . \tag{2.198}$$

Obviously, this function is invariant under translations in time. By choosing,
accordingly,

$$\tau(t, q, \dot q) = -1 , \qquad \kappa^i(t, q, \dot q) = 0 , \qquad M(t, q, \dot q) = 0 \tag{2.199}$$

the Noether invariant (2.196) is found to be

$$I = -L + \sum_i\frac{\partial L}{\partial\dot q^i}\dot q^i = -(T_{\rm kin} - U) + 2T_{\rm kin} = T_{\rm kin} + U = E . \tag{2.200}$$

Thus, invariance under time translations implies conservation of the total energy.

Example (iii) For the same system (2.198) choose the generating functions and
the gauge function as follows,

$$\tau(t, q, \dot q) = 0 , \qquad \kappa^{(k)1}(t, q, \dot q) = t , \qquad M(t, q, \dot q) = \sum_{k=1}^{n} m_k x^{(k)1} , \tag{2.201}$$

with $k$ numbering the particles from 1 to $n$. The number of degrees of freedom
being $f = 3n$, the functions $\kappa^i$ are numbered by that index $k$ and the three cartesian
directions. Inserting (2.201) into (2.196) the Noether invariant is found to be

$$I = t\sum_{k=1}^{n} m_k\dot x^{(k)1} - \sum_{k=1}^{n} m_k x^{(k)1} . \tag{2.202}$$

This is seen to be the 1-component of the linear combination

$$tMv_S - Mr_S(t) = tP - Mr_S(t) , \qquad \Bigl(M = \sum_{k=1}^{n} m_k\Bigr)$$

of the center-of-mass’s momentum $P$ and of its orbit $r_S(t)$ and is equal to the
1-component of $-Mr_S(0)$. This is the center-of-mass principle obtained earlier in
Sect. 1.12.
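The conservation of the invariant (2.202) can be watched in a small simulation. The sketch below is an illustration under stated assumptions: two particles on a line coupled by a harmonic spring stand in for the closed system, the masses, spring constant and initial data are arbitrary, and the velocity-Verlet scheme is simply a convenient integrator, not something prescribed by the text.

```python
def spring_forces(x, k_spring):
    """Internal forces of two particles on a line joined by a harmonic spring.
    They are equal and opposite, so the total force vanishes."""
    d = x[0] - x[1]
    return [-k_spring * d, k_spring * d]

def noether_invariant(t, masses, x, v):
    """I = t * sum(m_k * v_k) - sum(m_k * x_k), one Cartesian component of (2.202)."""
    momentum = sum(m * vk for m, vk in zip(masses, v))
    return t * momentum - sum(m * xk for m, xk in zip(masses, x))

def run(masses, x, v, k_spring, dt, n_steps):
    """Velocity-Verlet integration; returns the value of I after every step."""
    t = 0.0
    f = spring_forces(x, k_spring)
    values = [noether_invariant(t, masses, x, v)]
    for _ in range(n_steps):
        v_half = [vk + 0.5 * dt * fk / m for vk, fk, m in zip(v, f, masses)]
        x = [xk + dt * vh for xk, vh in zip(x, v_half)]
        f = spring_forces(x, k_spring)
        v = [vh + 0.5 * dt * fk / m for vh, fk, m in zip(v_half, f, masses)]
        t += dt
        values.append(noether_invariant(t, masses, x, v))
    return values

# Arbitrary illustrative data (assumptions, not from the text)
values = run(masses=[1.0, 3.0], x=[0.0, 1.0], v=[0.5, -0.2],
             k_spring=4.0, dt=0.01, n_steps=5000)
drift = max(values) - min(values)
print(values[0], drift)
```

Because the internal forces cancel, this integrator advances the center of mass exactly uniformly, so $I$ stays at its initial value $-\sum_k m_k x^{(k)}(0)$ up to rounding, while the relative coordinate oscillates.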

Remarks: 

1. There are more examples for the use of the generalized version of Noether’s 
theorem which apply to specific forms of the interaction. For instance, in the case 
of the Kepler problem with its characteristic 1/r-potential, one can derive the con- 
servation of the Hermann-Bernoulli-Laplace vector (usually called Lenz-Runge 
vector) (see also exercise 2.22 and its solution). This example is worked out in 
Boccaletti and Pucacco (1998). 

2. The theorem of E. Noether has a converse in the following sense. Taking
the derivative of the function $I(q, \dot q, t)$, eq. (2.196), with respect to $\dot q^j$ and using
the equations (2.194a) and (2.194b) one sees that

$$\frac{\partial I}{\partial\dot q^j} = \sum_k\frac{\partial^2 L}{\partial\dot q^j\,\partial\dot q^k}\,(\kappa^k - \dot q^k\tau) .$$

The matrix of second, mixed partial derivatives that multiplies the right-hand side,

$$\mathbf{A} = \{A_{jk}\} := \Bigl\{\frac{\partial^2 L}{\partial\dot q^j\,\partial\dot q^k}\Bigr\} ,$$

is well-known from the Legendre transformation from $L$ to $H$. Assume its
determinant to be different from zero,

$$D = \det\mathbf{A} \neq 0 ,$$

(which is the condition for the Legendre transformation to exist!) so that $\mathbf{A}$
possesses an inverse. Denoting the entries of the inverse by

$$\mathbf{A}^{-1} = \{A^{kl}\} , \quad\text{i.e.}\quad \sum_k A_{jk}A^{kl} = \delta_j^{\,l} ,$$

the initial equation can be solved for $\kappa^k$,

$$\kappa^k(t, q, \dot q) = \sum_l A^{kl}\frac{\partial I}{\partial\dot q^l} + \dot q^k\tau(t, q, \dot q) . \tag{2.203a}$$

Inserting this expression in (2.196) and solving for $\tau$ one obtains

$$\tau(t, q, \dot q) = \frac{1}{L}\Bigl[I(t, q, \dot q) + M(t, q, \dot q) - \sum_{k,l} A^{kl}\frac{\partial I}{\partial\dot q^l}\frac{\partial L}{\partial\dot q^k}\Bigr] . \tag{2.203b}$$



Thus, to every integral of the motion $I(q, \dot q, t)$ of the dynamical system described
by the Lagrangian function $L(q, \dot q, t)$ there correspond the infinitesimal
transformations (2.203a) and (2.203b). For all solutions $t \mapsto \varphi(t)$ of the equations of
motion these generating functions leave Hamilton’s action integral invariant.

Note, however, that $M(q, \dot q, t)$, to a large extent, is an arbitrary function and
that, as a consequence, the function $\tau(q, \dot q, t)$ is not unique. For a given integral
of the motion there are infinitely many symmetry transformations.

3. There is a corollary to the statement given in the previous remark. Suppose
an integral $I(q, \dot q, t) = I^{(0)}(q, \dot q, t)$ of the motion is given for the mechanical system
described by $L(q, \dot q, t)$, and that this integral corresponds to the transformation generated
by

$$\tau^{(0)} = 0 , \qquad \kappa^i = \kappa^{(0)i}(t, q, \dot q) , \qquad\text{with}\quad M = M^{(0)}(t, q, \dot q) .$$

Then the following transformations

$$\tau = \tau(t, q, \dot q) , \qquad \kappa^i = \kappa^{(0)i}(t, q, \dot q) + \tau(t, q, \dot q)\,\dot q^i ,$$

together with the choice

$$M = M^{(0)}(t, q, \dot q) + L(t, q, \dot q)\,\tau(t, q, \dot q)$$

lead to the same integral of the motion. The verification of this is left as an exercise.

Appendix: Practical Examples



1. Small Oscillations. Let a Lagrangian system be described in terms of $f$
generalized coordinates $\{q_i\}$, each of which can oscillate around an equilibrium
position $q_i^0$. The potential energy $U(q_1, \ldots, q_f)$ having an absolute minimum $U_0$ at
$(q_1^0, \ldots, q_f^0)$, one may visualize this system as a lattice defined by the equilibrium
positions $(q_1^0, \ldots, q_f^0)$, the edges of which can oscillate around this configuration.
The limit of small oscillations is realized if the potential energy can be
approximated by a quadratic form in the neighborhood of its minimum, viz.

$$U(q_1, \ldots, q_f) \simeq \frac{1}{2}\sum_{i,k=1}^{f} u_{ik}(q_i - q_i^0)(q_k - q_k^0) . \tag{A.1}$$

Note that for the mathematical pendulum (which has $f = 1$) this is identical with
the limit of small deviations from the vertical, i.e. the limit of harmonic oscillation.
For $f > 1$ this is a system of coupled harmonic oscillators.

Derive the equations of motion and find the normal modes of this system. 

Solution. It is clear that only the symmetric part of the coefficients $u_{ik}$ is
dynamically relevant, $a_{ik} \stackrel{\rm def}{=} (u_{ik} + u_{ki})/2$. As $U$ has a minimum, the matrix

$$\mathbf{A} = \{a_{ik}\}$$

is not only real and symmetric but also positive. This means that all its eigenvalues
are real and positive-semidefinite. It is useful to replace the variables $q_i$ by the
deviations from equilibrium, $z_i = q_i - q_i^0$. The kinetic energy is a quadratic form
of the time derivatives of $q_i$ or, equivalently, of $z_i$, with symmetric coefficients:

$$T = \frac{1}{2}\sum_{i,k} t_{ik}\dot z_i\dot z_k .$$

The matrix $\{t_{ik}\}$ is not singular and is positive as well. Therefore, one can choose
the natural form for the Lagrangian function

$$L = \frac{1}{2}\sum_{i,k}\bigl(t_{ik}\dot z_i\dot z_k - a_{ik}z_iz_k\bigr) , \tag{A.2}$$

from which follows the system of coupled equations

$$\sum_{k=1}^{f} t_{ik}\ddot z_k + \sum_{j=1}^{f} a_{ij}z_j = 0 , \qquad i = 1, \ldots, f . \tag{A.3}$$

For $f = 1$ this is the equation of the harmonic oscillator. This suggests solving
the general case by means of the substitution

$$z_i = a_i\,e^{i\Omega t} .$$

The complex form is chosen in order to simplify the calculations. In the end we
shall have to take the real part of the eigenmodes. Inserting this expression for $z$
into the equations of motion (A.3) yields the following system of coupled linear
equations:

$$\sum_{j=1}^{f}\bigl(-\Omega^2 t_{ij} + a_{ij}\bigr)a_j = 0 . \tag{A.4}$$

This has a nontrivial solution if and only if the determinant of its coefficients
vanishes,

$$\det\bigl(a_{ij} - \Omega^2 t_{ij}\bigr) = 0 . \tag{A.5}$$

This equation has $f$ positive-semidefinite solutions

$$\Omega_l^2 , \qquad l = 1, \ldots, f ,$$

which are said to be the eigenfrequencies of the system.

As an example we consider two identical harmonic oscillators (frequency $\omega_0$)
that are coupled by means of a harmonic spring. The spring is not active when
both oscillators are at rest (or, more generally, whenever the difference of their
positions is the same as at rest). It is not difficult to guess the eigenfrequencies
of this system: (i) the two oscillators swing in phase, the spring remains inactive;
(ii) the oscillators swing in opposite phase. Let us verify this behavior within the
general analysis. We have

$$T = \frac{1}{2}m(\dot z_1^2 + \dot z_2^2) ,$$
$$U = \frac{1}{2}m\omega_0^2(z_1^2 + z_2^2) + \frac{1}{2}m\omega_1^2(z_1 - z_2)^2 .$$

Taking out the common factor $m$, the system (A.4) reads

$$\begin{pmatrix} (\omega_0^2 + \omega_1^2) - \Omega^2 & -\omega_1^2 \\ -\omega_1^2 & (\omega_0^2 + \omega_1^2) - \Omega^2 \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = 0 . \tag{A.4'}$$

The condition (A.5) yields a quadratic equation whose solutions are

$$\Omega_1^2 = \omega_0^2 , \qquad \Omega_2^2 = \omega_0^2 + 2\omega_1^2 .$$

Inserting these, one by one, into the system of equations (A.4'), one finds

for $\Omega_1^2 = \omega_0^2$: $a_2^{(1)} = a_1^{(1)}$,

for $\Omega_2^2 = \omega_0^2 + 2\omega_1^2$: $a_2^{(2)} = -a_1^{(2)}$.

(The normalization is free. We choose $a_1^{(i)} = 1/\sqrt{2}$, $i = 1, 2$.) Thus, we indeed
obtain the expected solutions. The linear combinations above, i.e.

$$Q_1 \stackrel{\rm def}{=} \sqrt{m}\sum_i a_i^{(1)}z_i = (z_1 + z_2)\sqrt{m/2} ,$$
$$Q_2 \stackrel{\rm def}{=} \sqrt{m}\sum_i a_i^{(2)}z_i = (z_1 - z_2)\sqrt{m/2} ,$$

decouple the system completely. The Lagrangian function becomes

$$L = \frac{1}{2}\sum_{l=1}^{2}\bigl(\dot Q_l^2 - \Omega_l^2Q_l^2\bigr) .$$

It describes two independent linear oscillators. The new variables $Q_l$ are said to
be normal coordinates of the system. They are defined by the eigenvectors of the
matrix $(a_{ij} - \Omega_l^2 t_{ij})$ and correspond to the eigenvalues $\Omega_l^2$.
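For this two-oscillator example the condition (A.5) is a quadratic equation in $\Omega^2$ that can be solved directly. The sketch below is illustrative: the helper names and the numerical values of $m$, $\omega_0$, $\omega_1$ are assumptions. It solves $\det(a - \Omega^2 t) = 0$ for symmetric $2\times 2$ matrices and reads off the amplitude ratios from the first row of (A.4).

```python
import math

def generalized_eigenvalues_2x2(a, t):
    """Roots Omega^2 of det(a - Omega^2 * t) = 0 for symmetric 2x2 matrices a and t,
    with t positive definite. Expanding the determinant gives
    det(t)*lam**2 - b*lam + det(a) = 0 with b as below."""
    det_t = t[0][0] * t[1][1] - t[0][1]**2
    det_a = a[0][0] * a[1][1] - a[0][1]**2
    b = a[0][0] * t[1][1] + a[1][1] * t[0][0] - 2.0 * a[0][1] * t[0][1]
    disc = math.sqrt(b * b - 4.0 * det_t * det_a)
    return sorted([(b - disc) / (2.0 * det_t), (b + disc) / (2.0 * det_t)])

def eigenvector(a, t, lam):
    """Unnormalized (a_1, a_2) solving the first row of (a - lam*t) applied to it.
    Assumes that this first row is not identically zero."""
    r0 = a[0][0] - lam * t[0][0]
    r1 = a[0][1] - lam * t[0][1]
    return (-r1, r0)

# Arbitrary illustrative parameters (assumptions, not from the text)
m, omega0, omega1 = 2.0, 1.5, 0.8
w0sq, w1sq = omega0**2, omega1**2
a = [[m * (w0sq + w1sq), -m * w1sq], [-m * w1sq, m * (w0sq + w1sq)]]
t = [[m, 0.0], [0.0, m]]

lam1, lam2 = generalized_eigenvalues_2x2(a, t)
vec1 = eigenvector(a, t, lam1)
vec2 = eigenvector(a, t, lam2)
print(lam1, lam2, vec1, vec2)
```

The two roots reproduce $\omega_0^2$ and $\omega_0^2 + 2\omega_1^2$, with amplitude ratios $a_2/a_1 = +1$ (in phase) and $-1$ (opposite phase). For larger systems one would hand the matrices $t_{ij}$ and $a_{ij}$ to a library routine for the generalized symmetric eigenvalue problem.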

In the general case (/ > 2) one proceeds in an analogous fashion. Determine 
the frequencies from (A. 5) and insert them, one by one, into (A.4). Solve this 
system and determine the eigenvectors (af \ . . . , a^) (up to normalization) that 
pertain to the eigenvalues £2f. 

If all eigenvalues are different, the eigenvectors are uniquely determined up to
normalization. We write (A.4) for two different eigenvalues,

$$\sum_j \left( -\Omega_q^2 t_{ij} + a_{ij} \right) a_j^{(q)} = 0 , \qquad (A.6a)$$

$$\sum_i \left( -\Omega_p^2 t_{ij} + a_{ij} \right) a_i^{(p)} = 0 , \qquad (A.6b)$$

and multiply the first equation by $a_i^{(p)}$ from the left, the second by $a_j^{(q)}$ from the
left. We sum the first over i and the second over j and take their difference. Both
$t_{ij}$ and $a_{ij}$ are symmetric. Therefore, we obtain

$$\left( \Omega_p^2 - \Omega_q^2 \right) \sum_{i,j} a_i^{(p)} t_{ij} a_j^{(q)} = 0 .$$

As $\Omega_p^2 \neq \Omega_q^2$, the double sum must vanish if $p \neq q$. For p = q, we can normalize
the eigenvectors such that the double sum gives 1. We conclude that

$$\sum_{i,j} a_i^{(p)} t_{ij} a_j^{(q)} = \delta_{pq} .$$

Equation (A.6a) and the result above can be combined to obtain

$$\sum_{i,j} a_i^{(p)} a_{ij} a_j^{(q)} = \Omega_p^2 \sum_{i,j} a_i^{(p)} t_{ij} a_j^{(q)} = \Omega_p^2 \delta_{pq} .$$

This result tells us that the matrices $t_{ij}$ and $a_{ij}$ are diagonalized simultaneously.
We then set

$$z_i = \sum_p a_i^{(p)} Q_p \qquad (A.7)$$

and insert this into the Lagrangian function to obtain

$$L = \frac{1}{2} \sum_{p=1}^{f} \left( \dot Q_p^2 - \Omega_p^2 Q_p^2 \right) . \qquad (A.8)$$



Thus, we have achieved the transformation to normal coordinates. 

If some of the frequencies are degenerate, the corresponding eigenvectors are
no longer uniquely determined. It is always possible, however, to choose s linearly
independent vectors in the subspace that belongs to $\Omega_{r_1} = \Omega_{r_2} = \ldots = \Omega_{r_s}$ (s
denotes the degree of degeneracy). This construction is given in courses on linear
algebra.

One can go further and try several examples on a PC: a linear chain of n
oscillators with harmonic couplings, a planar lattice of mass points joined by harmonic
springs, etc., for which the matrices $t_{ik}$ and $a_{ik}$ are easily constructed. If one has at
one's disposal routines for matrix calculations, it is not difficult to find the
eigenfrequencies and the normal coordinates.
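
As a minimal sketch of such a numerical check, the following Python fragment treats the two coupled pendulums worked out above. It is an illustration under stated assumptions, not a general routine: the function names are hypothetical, the masses are taken equal so that the common factor m drops out, and the 2x2 symmetric eigenvalue problem is solved in closed form instead of by a matrix library.

```python
import math

def symmetric_2x2_eigen(p, q, r):
    """Eigenvalues (ascending) and unit eigenvectors of [[p, q], [q, r]]."""
    mean = 0.5 * (p + r)
    half_gap = math.hypot(0.5 * (p - r), q)
    values = (mean - half_gap, mean + half_gap)
    if q == 0.0:
        # Already diagonal: the coordinate axes are eigenvectors.
        vectors = [(1.0, 0.0), (0.0, 1.0)] if p <= r else [(0.0, 1.0), (1.0, 0.0)]
        return values, vectors
    vectors = []
    for lam in values:
        v = (q, lam - p)  # solves (p - lam) v1 + q v2 = 0
        norm = math.hypot(v[0], v[1])
        sign = -1.0 if v[0] < 0.0 else 1.0  # fix the free sign: a_1 > 0
        vectors.append((sign * v[0] / norm, sign * v[1] / norm))
    return values, vectors

def coupled_pendulum_modes(omega0, omega1):
    """Squared eigenfrequencies and eigenvectors of system (A.4')."""
    p = omega0 ** 2 + omega1 ** 2
    q = -omega1 ** 2
    return symmetric_2x2_eigen(p, q, p)

def normal_coordinates(z1, z2, m, vectors):
    """Q_p = sqrt(m) * sum_i a_i^(p) z_i for the two modes."""
    return tuple(math.sqrt(m) * (v[0] * z1 + v[1] * z2) for v in vectors)

if __name__ == "__main__":
    values, vectors = coupled_pendulum_modes(1.0, 0.5)
    print(values)   # squared frequencies omega0^2 and omega0^2 + 2 omega1^2
    print(vectors)  # in-phase and opposite-phase modes
```

For larger systems (f > 2) the closed form must be replaced by a general symmetric eigenvalue routine; the structure of the calculation stays the same.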

2. The Planar Mathematical Pendulum and Liouville’s Theorem. Work out 
(numerically) Example (ii) of Sect. 2.30 and illustrate it with some figures. 

Solution. We follow the notation of Sect. 1.17.2, i.e. we take $z_1 = \varphi$ as the
generalized coordinate and $z_2 = \dot\varphi/\omega$ as the generalized momentum, where $\omega = \sqrt{g/l}$
is the frequency of the corresponding harmonic oscillator and $\tau = \omega t$. Thus, time
is measured in units of $\omega^{-1}$. The energy is measured in units of mgl, i.e.

$$\varepsilon = \frac{E}{mgl} = \frac{1}{2} z_2^2 + (1 - \cos z_1) . \qquad (A.9)$$

$\varepsilon$ is positive-semidefinite. $\varepsilon < 2$ pertains to the oscillating solutions, $\varepsilon = 2$ is the
separatrix, and $\varepsilon > 2$ pertains to the rotating solutions. The equations of motion
(1.40) yield the second-order differential equation for $z_1$

$$\frac{d^2 z_1}{d\tau^2} = -\sin z_1(\tau) . \qquad (A.10)$$



First, one verifies that $z_1$ and $z_2$ are indeed conjugate variables, provided one uses
$\tau$ as time variable. In order to see this, start from the dimensionless Lagrangian
function

$$\lambda := \frac{L}{mgl} = \frac{1}{2\omega^2} \left( \frac{d\varphi}{dt} \right)^2 - (1 - \cos\varphi) = \frac{1}{2} \left( \frac{dz_1}{d\tau} \right)^2 - (1 - \cos z_1)$$

and take its derivative with respect to $\dot z_1 = (dz_1/d\tau)$. This gives $z_2 = (dz_1/d\tau) =
(d\varphi/dt)/\omega$, as expected.

For drawing the phase portraits, Fig. 1.10, it is sufficient to plot $z_2$ as a function
of $z_1$, as obtained from (A.9). This is not sufficient, however, if we wish to follow
the motion along the phase curves, as a function of time. As we wish to study the
time evolution of an ensemble of initial conditions, we must integrate the
differential equation (A.10). This integration can be done numerically, e.g. by means of
a Runge-Kutta procedure (cf. Abramowitz, Stegun 1965, Sect. 25.5.22). Equation
(A.10) has the form $y'' = -\sin y$. Let h be the step size and $y_n$ and $y'_n$ the values
of the function and its derivative respectively at $\tau_n$. Their values at $\tau_{n+1} = \tau_n + h$
are obtained by the following series of steps. Let

$$k_1 = -h \sin y_n , \quad k_2 = -h \sin\left( y_n + \tfrac{h}{2} y'_n + \tfrac{h}{8} k_1 \right) , \quad k_3 = -h \sin\left( y_n + h y'_n + \tfrac{h}{2} k_2 \right) . \qquad (A.11)$$

Then

$$y_{n+1} = y_n + h \left[ y'_n + \tfrac{1}{6} (k_1 + 2 k_2) \right] + O(h^4) ,$$

$$y'_{n+1} = y'_n + \tfrac{1}{6} k_1 + \tfrac{2}{3} k_2 + \tfrac{1}{6} k_3 + O(h^4) . \qquad (A.12)$$

Note that y is our $z_1$ while y' is $z_2$ and that the two are related by (A.9) to the
reduced energy $\varepsilon$. Equations (A.12) are easy to implement on a computer. Choose
an initial configuration $(y_0 = z_1(0), y'_0 = z_2(0))$, take $h = \pi/30$, for example,
and run the program until the time variable has reached a given endpoint $\tau$. Using
the dimensionless variable $\tau$, the harmonic oscillator (corresponding to small
oscillations of the pendulum) has the period $T^{(0)} = 2\pi$. It is convenient, therefore,
to choose the end point to be $T^{(0)}$ or fractions thereof. This shows very clearly
the retardation of the pendulum motion as compared to the oscillator: points on
pendulum phase portraits with $0 < \varepsilon \ll 2$ move almost as fast as points on the
oscillator portrait; the closer $\varepsilon$ approaches 2 from below, the more they are retarded
compared to the oscillator. Points on the separatrix ($\varepsilon = 2$) that start from, say,
$(z_1 = 0, z_2 = 2)$ can never move beyond the first quadrant of the $(z_1, z_2)$-plane.
They approach the point $(\pi, 0)$ asymptotically, as $\tau$ goes to infinity.
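
The stepping rule quoted from Abramowitz and Stegun (Sect. 25.5.22) for equations of the form y'' = f(y) can be sketched in Python as follows. This is an illustrative sketch only: the function names, the helper that builds a circular ensemble, and the chosen initial values are assumptions made for the example and are not taken from the text.

```python
import math

def nystrom_step(f, y, yp, h):
    """One step of the Runge-Kutta rule for y'' = f(y) (A&S 25.5.22)."""
    k1 = h * f(y)
    k2 = h * f(y + 0.5 * h * yp + 0.125 * h * k1)
    k3 = h * f(y + h * yp + 0.5 * h * k2)
    y_next = y + h * (yp + (k1 + 2.0 * k2) / 6.0)
    yp_next = yp + k1 / 6.0 + 2.0 * k2 / 3.0 + k3 / 6.0
    return y_next, yp_next

def flow(f, z1, z2, h, n_steps):
    """Transport the phase-space point (z1, z2) over n_steps steps of size h."""
    for _ in range(n_steps):
        z1, z2 = nystrom_step(f, z1, z2, h)
    return z1, z2

def pendulum(y):
    return -math.sin(y)      # right-hand side of the pendulum equation

def oscillator(y):
    return -y                # harmonic oscillator, used as a program test

def reduced_energy(z1, z2):
    """Dimensionless energy: z2^2 / 2 + (1 - cos z1)."""
    return 0.5 * z2 * z2 + (1.0 - math.cos(z1))

def circle_ensemble(center, radius, n_points):
    """Initial conditions on a circle in the (z1, z2)-plane."""
    return [(center[0] + radius * math.cos(2.0 * math.pi * k / n_points),
             center[1] + radius * math.sin(2.0 * math.pi * k / n_points))
            for k in range(n_points)]

if __name__ == "__main__":
    h = math.pi / 30.0
    steps = 60                       # one oscillator period, 2*pi
    for point in circle_ensemble((0.0, 1.0), 0.5, 32):
        print(flow(pendulum, point[0], point[1], h, steps))
```

Monitoring `reduced_energy` along a trajectory gives a simple accuracy check, since the exact flow conserves it.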

In the examples shown in Figs. 2.13-15 we study the flow of an initial ensemble
of 32 points on a circle with radius r = 0.5 and the center of that circle, for the
time intervals indicated in the figures. This allows one to follow the motion of
each individual point. As an example, in Fig. 2.14 we have marked with arrows
the consecutive positions of the point that started from the configuration (0, 1).

Of course, one may try other shapes for the initial ensemble (instead of the
circle) and follow its flow through phase space. A good test of the program is to
replace the right-hand side of (A.10) with $-z_1$. This should give the picture shown
in Fig. 2.12.



3. The Mechanics of Rigid Bodies

The theory of rigid bodies is a particularly important part of general mechanics.
Firstly, next to the spherically symmetric mass distributions that we studied in
Sect. 1.30, the top is the simplest example of a body with finite extension.
Secondly, its dynamics is a particularly beautiful model case to which one can apply
the general principles of canonical mechanics and where one can study the
consequences of the various space symmetries in an especially transparent manner.
Thirdly, its equations of motion (Euler's equations) provide an interesting
example of nonlinear dynamics. Fourthly, the description of the rigid body leads again
to the compact Lie group SO(3) that we studied in connection with the
invariance of equations of motion with respect to rotations. The configuration space of
a nondegenerate top is the direct product of the three-dimensional space $\mathbb{R}^3$ and of
the group SO(3), in the following sense. The momentary configuration of a rigid
body is determined if we know (i) the position of its center of mass, and (ii) the
orientation of the body relative to a given inertial system. The center of mass is
described by a position vector $\mathbf{r}_S(t)$ in $\mathbb{R}^3$; the orientation is described by three
time-dependent angles which span the parameter manifold of SO(3).

Finally, there are special cases of the theory of rigid bodies which can be in- 
tegrated analytically, or can be analyzed by geometrical means. Thus, one meets 
further nontrivial examples of integrable systems. 



3.1 Definition of Rigid Body 

A rigid body can be visualized in two ways: 

(A) A system of n mass points, with masses $m_1, \ldots, m_n$, which are joined by
rigid links, is a rigid body. Figure 3.1 shows the example n = 4.

(B) A body with a given continuous mass distribution $\varrho(\mathbf{r})$ whose shape does
not change, is also a rigid body. The hatched volume shown in Fig. 3.2 is an
example.

In case (A) the total mass is given by

$$M = \sum_{i=1}^{n} m_i , \qquad (3.1)$$

while for case (B) it is

$$M = \int d^3r \, \varrho(\mathbf{r}) , \qquad (3.2)$$

(cf. Sect. 1.30).

Fig. 3.1. A finite number of mass points whose distances are fixed at all times form a rigid
body. The figure shows the example n = 4

Fig. 3.2. A rigid body consisting of a fixed, invariable mass distribution

The two definitions lead to the same type of mechanical system. This depends
in an essential way on the assumption that the body has no internal degrees of
freedom whatsoever. If the distribution $\varrho(\mathbf{r})$ of case (B) is allowed to be deformed,
there will be internal forces. This is the subject of the mechanics of continua. One
can easily imagine that the dynamics of an extended object with continuous mass
distribution is quite different from that of the system shown in Fig. 3.1 when the
object is not rigid.

It is useful to introduce two classes of coordinate system for the description
of rigid bodies and their motion:

(i) a coordinate system K that is fixed in space and is assumed to be an inertial
system;

(ii) an intrinsic (or body-fixed) coordinate system $\bar{K}$ which is fixed in the body
and therefore follows its motion actively.

Figure 3.3 shows examples of these two types of reference system. The inertial
system K (which we may also call the observer's or "laboratory" system) is
useful for a simple description of the motion. The intrinsic, body-fixed system $\bar{K}$ in
general is not an inertial system because its origin follows the motion of the body
as a whole. It is useful because, with respect to this system, the mass distribution
and all static properties derived from it are described in the simplest way. Take
for example the mass density. If looked at from $\bar{K}$, $\varrho(\mathbf{x})$ is a given function, fixed
once and for ever, irrespective of the motion of the body. With respect to K, on
the other hand, it is a time-dependent function $\varrho(\mathbf{r}, t)$ that depends on how the
body moves in space. (For an example see Exercise 3.9.)

The origin S of $\bar{K}$ is an arbitrary but fixed point in the body (it will often be
useful to choose the center of mass for S). Let $\mathbf{r}_S(t)$ be the position vector of S
with respect to the inertial system K. Another point P of the body has position
vector $\mathbf{r}(t)$ with respect to K, and $\mathbf{x}$ with respect to $\bar{K}$. As it describes P relative
to S, $\mathbf{x}$ is independent of time, by construction.

The number of degrees of freedom of a rigid body can be read off Fig. 3.3.
Its position in space is completely determined by the following data: the position
$\mathbf{r}_S(t)$ of S and the orientation of the intrinsic system $\bar{K}$ with respect to another
system centered on S whose axes are parallel to those of K. For this we need six
quantities: the three components of $\mathbf{r}_S$, as well as three angles that fix the relative
orientation of $\bar{K}$. Therefore, a nondegenerate rigid body has six degrees of freedom.

(The degenerate case of the rod is an exception. The rod is a rigid body whose
mass points all lie on a line. It has only five degrees of freedom.) It is essential
to distinguish carefully the (space-fixed) inertial system K from the (body-fixed)
system $\bar{K}$. Once one has understood the difference between these two reference
systems and the role they play in the description of the rigid body, the theory of
the top becomes simple and clear.



3.2 Infinitesimal Displacement of a Rigid Body 

If we shift and rotate the rigid body infinitesimally, a point P of the body is
displaced as follows:

$$d\mathbf{r} = d\mathbf{r}_S + d\boldsymbol{\varphi} \times \mathbf{x} , \qquad (3.3)$$

where we have used the notation of Fig. 3.3. The displacement $d\mathbf{r}_S$ of the point
S is the parallel shift of the body as a whole. The direction $\hat{\mathbf{n}} = d\boldsymbol{\varphi}/|d\boldsymbol{\varphi}|$ and
the angle $|d\boldsymbol{\varphi}|$ characterize the rotation of the body, for a fixed position of S.
The translational part of (3.3) is immediately clear. The second term, which is
due to the rotation, follows from (2.68) of Sect. 2.21 and takes account of the fact
that here we are dealing with an active rotation, while the rotation discussed in
Sect. 2.21 was a passive one - hence the difference in sign. Alternatively, the action
of this infinitesimal rotation can also be understood from Fig. 3.4. We have $|d\mathbf{x}| =
|\mathbf{x}| \cdot |d\boldsymbol{\varphi}| \sin\alpha$, with $(\hat{\mathbf{n}}, \mathbf{x}, d\mathbf{x})$ forming a right-handed system. Therefore, as claimed in
(3.3), $d\mathbf{x} = d\boldsymbol{\varphi} \times \mathbf{x}$.

Fig. 3.4. Drawing of the action of a small rotation of the rigid body,
from which relation (3.3) can be read off



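From (3.3) follows an important relation between the velocities of the points
P and S,

$$\mathbf{v} \stackrel{\mathrm{def}}{=} \frac{d\mathbf{r}}{dt} \quad \text{and} \quad \mathbf{V} \stackrel{\mathrm{def}}{=} \frac{d\mathbf{r}_S}{dt} , \qquad (3.4a)$$

respectively, and the angular velocity

$$\boldsymbol{\omega} \stackrel{\mathrm{def}}{=} \frac{d\boldsymbol{\varphi}}{dt} . \qquad (3.4b)$$

It reads

$$\mathbf{v} = \mathbf{V} + \boldsymbol{\omega} \times \mathbf{x} . \qquad (3.5)$$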

Thus, the velocity of P is the sum of the translation velocity of the body as a whole
and of a term linear in the angular velocity $\boldsymbol{\omega}$. We now show that this angular
velocity is universal in the sense that it characterizes the rotational motion of the
body but does not depend on the choice of S, the origin of $\bar{K}$. In order to see
this, choose another point S' with coordinate $\mathbf{r}'_S = \mathbf{r}_S + \mathbf{a}$. The relation (3.5) also
applies to this choice,

$$\mathbf{v} = \mathbf{V}' + \boldsymbol{\omega}' \times \mathbf{x}' .$$

On the other hand we have $\mathbf{r} = \mathbf{r}'_S + \mathbf{x}' = \mathbf{r}_S + \mathbf{a} + \mathbf{x}'$, and hence $\mathbf{x} = \mathbf{x}' + \mathbf{a}$
and $\mathbf{v} = \mathbf{V} + \boldsymbol{\omega} \times \mathbf{a} + \boldsymbol{\omega} \times \mathbf{x}'$. These two expressions for the same velocity hold
for any $\mathbf{x}$ or $\mathbf{x}'$. From this we conclude that

$$\mathbf{V}' = \mathbf{V} + \boldsymbol{\omega} \times \mathbf{a} , \qquad (3.6a)$$

$$\boldsymbol{\omega}' = \boldsymbol{\omega} . \qquad (3.6b)$$

This shows the universality of the angular velocity. 
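
The argument can be checked numerically with a few lines of Python. The sketch below is only an illustration; the helper names and the sample vectors are arbitrary assumptions. It computes the velocity of one body point from two different reference points S and S' and confirms that both descriptions agree when the translation velocity is shifted and the angular velocity is left unchanged.

```python
def cross(a, b):
    """Vector product a x b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))

def velocity(V, omega, x):
    """Velocity of a body point: v = V + omega x x."""
    return add(V, cross(omega, x))

def shifted_reference(V, omega, a):
    """Change of reference point S -> S' = S + a: (V', omega')."""
    return add(V, cross(omega, a)), omega

if __name__ == "__main__":
    V = (1.0, 0.0, 2.0)
    omega = (0.3, -0.5, 0.7)
    a = (0.2, 0.1, -0.4)
    x_prime = (1.0, 2.0, 3.0)      # point P as seen from S'
    x = add(x_prime, a)            # the same point as seen from S
    V_prime, omega_prime = shifted_reference(V, omega, a)
    print(velocity(V, omega, x))
    print(velocity(V_prime, omega_prime, x_prime))
```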



3.3 Kinetic Energy and the Inertia Tensor 

From now on we place S, the origin of the intrinsic system $\bar{K}$, at the center of mass
of the body. (Exceptions to this will be mentioned explicitly.) With the definition
(1.29) of the center of mass, this implies in case (A) that

$$\sum_{i=1}^{n} m_i \mathbf{x}^{(i)} = 0 \qquad (3.7a)$$

and in case (B) that

$$\int d^3x \, \mathbf{x} \, \varrho(\mathbf{x}) = 0 . \qquad (3.7b)$$

We calculate the kinetic energy for both cases (A) and (B), for the sake of 
illustration. 

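(i) In the discrete model of the rigid body and making use of (3.5) we find

$$T = \frac{1}{2} \sum_{i=1}^{n} m_i \mathbf{v}^{(i)\,2} = \frac{1}{2} \sum_{i=1}^{n} m_i \left( \mathbf{V} + \boldsymbol{\omega} \times \mathbf{x}^{(i)} \right)^2$$

$$= \frac{1}{2} \Big( \sum_i m_i \Big) \mathbf{V}^2 + \mathbf{V} \cdot \sum_i m_i \left( \boldsymbol{\omega} \times \mathbf{x}^{(i)} \right) + \frac{1}{2} \sum_i m_i \left( \boldsymbol{\omega} \times \mathbf{x}^{(i)} \right)^2 . \qquad (3.8)$$

In the second term of this expression one may use the identity

$$\mathbf{V} \cdot \left( \boldsymbol{\omega} \times \mathbf{x}^{(i)} \right) = \mathbf{x}^{(i)} \cdot \left( \mathbf{V} \times \boldsymbol{\omega} \right)$$

to obtain

$$\mathbf{V} \cdot \sum_i m_i \left( \boldsymbol{\omega} \times \mathbf{x}^{(i)} \right) = \left( \mathbf{V} \times \boldsymbol{\omega} \right) \cdot \sum_i m_i \mathbf{x}^{(i)} = 0 .$$

This term vanishes because of the condition (3.7a).

The third term on the right-hand side of (3.8) contains the square of the vector
$\boldsymbol{\omega} \times \mathbf{x}^{(i)}$. Omitting for a moment the particle index, we can transform this as follows:

$$(\boldsymbol{\omega} \times \mathbf{x})^2 = \omega^2 x^2 \sin^2\alpha = \omega^2 x^2 (1 - \cos^2\alpha)$$

$$= \omega^2 x^2 - (\boldsymbol{\omega} \cdot \mathbf{x})^2 = \sum_{\mu=1}^{3} \sum_{\nu=1}^{3} \omega_\mu \left( x^2 \delta_{\mu\nu} - x_\mu x_\nu \right) \omega_\nu .$$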

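The decomposition of this last expression in Cartesian coordinates serves the
purpose of separating the coordinates $\mathbf{x}^{(i)}$ from the components of the angular
velocity $\boldsymbol{\omega}$. The former scan the rigid body, while the latter are universal and hence
independent of the body. Inserting these auxiliary results into (3.8) we obtain a
simple form for the kinetic energy,

$$T = \frac{1}{2} M \mathbf{V}^2 + \frac{1}{2} \sum_{\mu=1}^{3} \sum_{\nu=1}^{3} \omega_\mu J_{\mu\nu} \omega_\nu , \qquad (3.9)$$

where we have set

$$J_{\mu\nu} \stackrel{\mathrm{def}}{=} \sum_{i=1}^{n} m_i \left[ \mathbf{x}^{(i)\,2} \delta_{\mu\nu} - x_\mu^{(i)} x_\nu^{(i)} \right] . \qquad (3.10a)$$

(ii) The calculation is completely analogous for the continuous model of the rigid
body,

$$T = \frac{1}{2} \int d^3x \, \varrho(\mathbf{x}) \left( \mathbf{V} + \boldsymbol{\omega} \times \mathbf{x} \right)^2$$

$$= \frac{1}{2} \mathbf{V}^2 \int d^3x \, \varrho(\mathbf{x}) + (\mathbf{V} \times \boldsymbol{\omega}) \cdot \int d^3x \, \varrho(\mathbf{x}) \, \mathbf{x} + \frac{1}{2} \sum_{\mu,\nu} \omega_\mu \left[ \int d^3x \, \varrho(\mathbf{x}) \left( x^2 \delta_{\mu\nu} - x_\mu x_\nu \right) \right] \omega_\nu .$$

The integral in the first term is the total mass. The second term vanishes because
of the condition (3.7b). Thus, the kinetic energy takes the same form (3.9), with
$J_{\mu\nu}$ now given by $^1$

$$J_{\mu\nu} \stackrel{\mathrm{def}}{=} \int d^3x \, \varrho(\mathbf{x}) \left[ x^2 \delta_{\mu\nu} - x_\mu x_\nu \right] . \qquad (3.10b)$$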

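1 In general, $J_{\mu\nu}$ depends on time, whenever the $x_i$ refer to a fixed reference frame in space and
when the body rotates with respect to that frame (see Sect. 3.11).

As a result, the kinetic energy of a rigid body (3.9) has the general decomposition

$$T = T_{\mathrm{trans}} + T_{\mathrm{rot}} , \qquad (3.11)$$

where $T_{\mathrm{trans}} = \frac{1}{2} M \mathbf{V}^2$ is the kinetic energy of the translational motion and
$T_{\mathrm{rot}} = \frac{1}{2} \sum_{\mu,\nu} \omega_\mu J_{\mu\nu} \omega_\nu$ is that of the rotational motion. The quantities $J_{\mu\nu}$ form a
tensor of rank two: if the coordinates are subjected to a rotation

$$x_\mu \to x'_\mu = \sum_{\nu=1}^{3} R_{\mu\nu} x_\nu$$

with $R \in SO(3)$, then

$$J_{\mu\nu} \to J'_{\mu\nu} = \sum_{\sigma=1}^{3} \sum_{\tau=1}^{3} R_{\mu\sigma} R_{\nu\tau} J_{\sigma\tau} . \qquad (3.14)$$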

Being completely determined by the mass distribution, this tensor is characteristic 
for the rigid body. It is called the inertia tensor. This name reflects its formal 
similarity to the inertial mass (which is a scalar, though). 

The tensor J is defined over a three-dimensional Euclidean vector space V.
Generally speaking, second-rank tensors are bilinear forms over V. Inertia tensors
belong to the subset of real, symmetric, and (as we shall see below) positive tensors
over V. We shall not go into the precise mathematical definitions here. What is
important for what follows is the transformation behavior (3.14); that is, omitting
indices, $J' = R J R^T$.
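
The transformation behavior can be illustrated for the discrete model (A). The following Python sketch uses hypothetical helper names and arbitrary sample masses and positions. It builds the inertia tensor of a few point masses, rotates the positions, and compares the tensor computed from the rotated positions with the one obtained from the rule $J' = R J R^T$.

```python
import math

def inertia_tensor(masses, points):
    """J_mu_nu = sum_i m_i (x^2 delta_mu_nu - x_mu x_nu) for point masses."""
    J = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, points):
        x2 = sum(c * c for c in x)
        for mu in range(3):
            for nu in range(3):
                delta = x2 if mu == nu else 0.0
                J[mu][nu] += m * (delta - x[mu] * x[nu])
    return J

def rotation_about_3(theta):
    """Rotation matrix about the 3-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotate_vector(R, x):
    """x'_mu = sum_nu R_mu_nu x_nu."""
    return tuple(sum(R[mu][nu] * x[nu] for nu in range(3)) for mu in range(3))

def rotate_tensor(R, J):
    """J'_mu_nu = sum_{sigma,tau} R_mu_sigma R_nu_tau J_sigma_tau."""
    return [[sum(R[mu][s] * R[nu][t] * J[s][t]
                 for s in range(3) for t in range(3))
             for nu in range(3)] for mu in range(3)]

if __name__ == "__main__":
    masses = [1.0, 2.0, 0.5]
    points = [(1.0, 0.0, 2.0), (-0.5, 1.5, 0.3), (0.2, -1.0, -0.7)]
    R = rotation_about_3(0.7)
    J = inertia_tensor(masses, points)
    print(inertia_tensor(masses, [rotate_vector(R, x) for x in points]))
    print(rotate_tensor(R, J))
```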



3.4 Properties of the Inertia Tensor 

In this and the two subsequent sections we study the inertia tensor as a static
property of the rigid body. This means we assume the body to be at rest or,
equivalently, make use of a coordinate system that is rigidly linked to the body. The
inertia tensor contains an invariant term that is already diagonal,

$$\int d^3x \, \varrho(\mathbf{x}) \, x^2 \delta_{\mu\nu} ,$$

and a term that depends on the specific choice of the intrinsic reference system,

$$- \int d^3x \, \varrho(\mathbf{x}) \, x_\mu x_\nu ,$$

and that in general is not diagonal. The following properties of the inertia tensor
can be derived from its definition (3.10).

(i) J is linear and therefore additive in the mass density $\varrho(\mathbf{x})$. This means that
the inertia tensor of a body obtained by joining two rigid bodies equals the sum
of the inertia tensors of its components. Quantities that have this additive property
are also said to be extensive.

(ii) J is represented by a real, symmetric matrix that reads explicitly

$$J = \int d^3x \, \varrho(\mathbf{x}) \begin{pmatrix} x_2^2 + x_3^2 & -x_1 x_2 & -x_1 x_3 \\ -x_2 x_1 & x_3^2 + x_1^2 & -x_2 x_3 \\ -x_3 x_1 & -x_3 x_2 & x_1^2 + x_2^2 \end{pmatrix} . \qquad (3.15)$$

Every real and symmetric matrix can be brought to diagonal form by means
of an orthogonal transformation $R_0 \in SO(3)$

$$R_0 J R_0^{-1} = \overset{\circ}{J} = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix} . \qquad (3.16)$$

In other words, by a suitable choice of the body-fixed system of reference the
inertia tensor can be made diagonal. Reference systems that have this property are
again orthogonal systems and are said to be principal-axes systems. Of course, the
same representation (3.15) is also valid in a system of principal axes. As J is then
diagonal, we have

$$\overset{\circ}{J} = \int d^3y \, \varrho(\mathbf{y}) \begin{pmatrix} y_2^2 + y_3^2 & 0 & 0 \\ 0 & y_3^2 + y_1^2 & 0 \\ 0 & 0 & y_1^2 + y_2^2 \end{pmatrix} , \qquad (3.17)$$

from which we derive the following properties of the eigenvalues:

$$I_i \geq 0 , \quad i = 1, 2, 3 , \qquad (3.18a)$$

$$I_1 + I_2 \geq I_3 , \quad I_2 + I_3 \geq I_1 , \quad I_3 + I_1 \geq I_2 . \qquad (3.18b)$$

Thus, the matrix J is indeed positive. Its eigenvalues $I_i$ are called (principal)
moments of inertia.

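Diagonalization of the inertia tensor is a typical eigenvalue problem of linear
algebra. The problem is to find those directions $\hat{\boldsymbol{\omega}}^{(i)}$, i = 1, 2, 3, for which

$$J \hat{\boldsymbol{\omega}}^{(i)} = I_i \hat{\boldsymbol{\omega}}^{(i)} . \qquad (3.19)$$

This linear system of equations has a nontrivial solution provided its determinant
vanishes,

$$\det \left( J - I_i \mathbb{1} \right) = 0 . \qquad (3.20)$$

Equation (3.20) is a cubic equation for the unknown $I_i$. According to (3.17) and
(3.18a) it has three real, positive-semidefinite solutions. The eigenvector $\hat{\boldsymbol{\omega}}^{(k)}$ that
belongs to the eigenvalue $I_k$ is obtained from (3.19), which is to be solved three
times, for k = 1, 2, and 3. The matrix $R_0$ in (3.16) is then given by

$$R_0 = \begin{pmatrix} \hat\omega_1^{(1)} & \hat\omega_2^{(1)} & \hat\omega_3^{(1)} \\ \hat\omega_1^{(2)} & \hat\omega_2^{(2)} & \hat\omega_3^{(2)} \\ \hat\omega_1^{(3)} & \hat\omega_2^{(3)} & \hat\omega_3^{(3)} \end{pmatrix} . \qquad (3.21)$$

It is not difficult to show that two eigenvectors $\hat{\boldsymbol{\omega}}^{(i)}$, $\hat{\boldsymbol{\omega}}^{(k)}$, which belong to distinct
eigenvalues $I_i$ and $I_k$, respectively, are orthogonal. For this take the difference
$\hat{\boldsymbol{\omega}}^{(i)} \cdot J \hat{\boldsymbol{\omega}}^{(k)} - \hat{\boldsymbol{\omega}}^{(k)} \cdot J \hat{\boldsymbol{\omega}}^{(i)}$. With (3.19) this becomes

$$\hat{\boldsymbol{\omega}}^{(i)} \cdot J \hat{\boldsymbol{\omega}}^{(k)} - \hat{\boldsymbol{\omega}}^{(k)} \cdot J \hat{\boldsymbol{\omega}}^{(i)} = (I_k - I_i) \left( \hat{\boldsymbol{\omega}}^{(i)} \cdot \hat{\boldsymbol{\omega}}^{(k)} \right) .$$

The left-hand side vanishes because J is symmetric. Therefore, if $I_i \neq I_k$, then

$$\hat{\boldsymbol{\omega}}^{(k)} \cdot \hat{\boldsymbol{\omega}}^{(i)} = 0 . \qquad (3.22)$$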

It may happen that two (or more) eigenvalues are equal, $I_i = I_k$, in which case
we cannot prove the above orthogonality. However, as the system (3.19) is linear,
any linear combination of $\hat{\boldsymbol{\omega}}^{(i)}$ and of $\hat{\boldsymbol{\omega}}^{(k)}$, say $\hat{\boldsymbol{\omega}}^{(i)} \cos\alpha + \hat{\boldsymbol{\omega}}^{(k)} \sin\alpha$, is also an
eigenvector of J, with eigenvalue $I_i = I_k$. It is then clear that we can always
choose, by hand, two orthogonal linear combinations. The degeneracy just tells
us that there is no preferred choice of principal axes. We illustrate this by means
of the following model. Suppose the inertia tensor, after diagonalization, has the
form

$$\overset{\circ}{J} = \begin{pmatrix} A & 0 & 0 \\ 0 & A & 0 \\ 0 & 0 & B \end{pmatrix}$$

with $A \neq B$. Any further rotation about the 3-axis has the form

$$R = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and leaves $\overset{\circ}{J}$ invariant. Thus any direction in the (1,2)-plane is a principal axis,
too, and corresponds to the moment of inertia A. In this plane we choose two
orthogonal axes. Because $B \neq A$ the third principal axis is perpendicular to these.

(iii) The inertia tensor and specifically its eigenvalues (the moments of inertia)
are static properties of the body, very much like its mass. As we shall see below,
the angular momentum and the kinetic energy are proportional to $I_k$ when the body
rotates about the corresponding eigenvector $\hat{\boldsymbol{\omega}}^{(k)}$.

A body whose moments of inertia are all different, $I_1 \neq I_2 \neq I_3$, is said to be
an asymmetric, or triaxial, top. If two of the moments are equal, $I_1 = I_2 \neq I_3$,
we talk about the symmetric top. If all three moments are equal, $I_1 = I_2 = I_3$,
we call it a spherical top $^2$.

(iv) If the rigid body has a certain amount of symmetry in shape and mass 
distribution, the determination of its center of mass and its principal axes is a lot 
easier. For instance, we have the following. 



Proposition: If the shape and mass distribution of a rigid body is symmet- 
ric with respect to reflection in a plane (see Fig. 3.5), its center of mass 
and two of its principal axes lie in that plane. The third principal axis is 
perpendicular to it. 



2 This does not necessarily mean that the rigid body has a spherical shape.

Fig. 3.5. A rigid body that is symmetric under reflection in the plane shown in the figure

Proof. As a first trial choose an orthogonal frame of reference whose 1- and 2-axes
are in the plane and whose 3-axis is perpendicular to it. For symmetry reasons, to
any mass element with positive $x_3$ there corresponds an equal mass element with
negative $x_3$. Therefore, $\int d^3x \, x_3 \varrho(\mathbf{x}) = 0$. Comparison with (3.7b) shows that the
first part of the proposition is true: S lies in the plane of symmetry. Suppose now
S is found and the system $(x_1, x_2, x_3)$ is centered in S. In the expression (3.15)
for J the following integrals vanish:

$$\int d^3x \, \varrho(\mathbf{x}) \, x_1 x_3 = 0 = \int d^3x \, \varrho(\mathbf{x}) \, x_2 x_3 .$$

This is so because, for fixed $x_1$ (or $x_2$, respectively), the positive values of $x_3$
and the corresponding negative values $-x_3$ give equal and opposite contributions.
What remains is

$$J = \begin{pmatrix} J_{11} & J_{12} & 0 \\ J_{12} & J_{22} & 0 \\ 0 & 0 & I_3 \end{pmatrix} .$$

However, this matrix is diagonalizable by a rotation in the plane of symmetry (i.e.
one about the 3-axis). This proves the second part of the proposition. □

Similar arguments apply to the case when the body possesses axial symmetry, 
i.e. if it is symmetric under rotations about a certain axis. In this case the center of 
mass S lies on the symmetry axis and that axis is a principal axis. The remaining 
two are perpendicular to it. The corresponding moments of inertia being degener- 
ate, they must be chosen by hand in the plane through S that is perpendicular to 
the symmetry axis. 

Remark: In calculations involving the inertia tensor the following symbolic no- 
tations can be very useful. 

Let any vector or vector field $\mathbf{a}$ over $\mathbb{R}^3$ be written as $|a\rangle$. Its dual, which acts
on any other vector (field) $|c\rangle$, is denoted $\langle a|$ and so looks like a kind of
mirror image of $|a\rangle$. With this notation an expression such as $\langle a|c\rangle$ is nothing but
the ordinary scalar product $\mathbf{a} \cdot \mathbf{c}$. On the other hand, an object such as $|b\rangle\langle a|$ is
a tensor which acts on other vectors $\mathbf{c}$ by $|b\rangle\langle a|c\rangle = (\mathbf{a} \cdot \mathbf{c}) \mathbf{b}$, thus yielding new
vectors parallel to $\mathbf{b}$. This is to say that $|a\rangle = (a_1, a_2, a_3)^T$ is a column vector
while its dual $\langle a| = (a_1, a_2, a_3)$ is a row vector. Applying standard rules of matrix
calculus, one has

$$\langle b|a\rangle = (b_1 \; b_2 \; b_3) \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \sum_{k=1}^{3} b_k a_k = \mathbf{b} \cdot \mathbf{a} ,$$

$$|b\rangle\langle a| = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} (a_1 \; a_2 \; a_3) = \begin{pmatrix} b_1 a_1 & b_1 a_2 & b_1 a_3 \\ b_2 a_1 & b_2 a_2 & b_2 a_3 \\ b_3 a_1 & b_3 a_2 & b_3 a_3 \end{pmatrix} .$$

For example, the definition (3.10b), written in this notation, becomes

$$J = \int d^3x \, \varrho(\mathbf{x}) \left[ \langle x|x\rangle \mathbb{1}_3 - |x\rangle\langle x| \right] ,$$

where $\mathbb{1}_3$ is the 3×3 unit matrix. This notation emphasizes the fact that J is an
object that acts on vectors (or vector fields) and yields as the result another vector
(or vector field).

In fact, this notation is the same as Dirac's "ket" and "bra" notation that the
reader will encounter in quantum theory. In the older literature on vector analysis
the tensor $|b\rangle\langle a|$ was called the dyadic product of $\mathbf{b}$ and $\mathbf{a}$.




3.5 Steiner’s Theorem 



Let J be the inertia tensor as calculated according to (3.15) in a body-fixed
system $\bar{K}$ with origin S, the center of mass. Let $\bar{K}'$ be another body-fixed
system which is obtained by shifting $\bar{K}$ by a given translation vector $\mathbf{a}$, as
shown in Fig. 3.6. Let J' be the inertia tensor as calculated in the second
system,

$$J'_{\mu\nu} = \int d^3x' \, \varrho(\mathbf{x}') \left[ x'^2 \delta_{\mu\nu} - x'_\mu x'_\nu \right] ,$$

with $\mathbf{x}' = \mathbf{x} + \mathbf{a}$. Then J' and J are related by

$$J'_{\mu\nu} = J_{\mu\nu} + M \left[ a^2 \delta_{\mu\nu} - a_\mu a_\nu \right] . \qquad (3.23)$$

In the compact "bracket" notation introduced above, it reads $J' = J + J_a$,
with $J_a = M \left[ \langle a|a\rangle \mathbb{1}_3 - |a\rangle\langle a| \right]$.



The proof is not difficult. Insert $\mathbf{x}' = \mathbf{x} + \mathbf{a}$ into the first equation and take account
of the fact that all integrals with integrands that are linear in $\mathbf{x}$ vanish because of
the center-of-mass condition (3.7b). In Fig. 3.6 $\bar{K}'$ has axes parallel to those of $\bar{K}$.
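
Steiner's theorem is easy to test numerically for a set of point masses. The Python sketch below is an illustration with hypothetical function names and arbitrary sample data. It first shifts the points so that their center of mass lies at the origin, then compares the inertia tensor computed directly in the shifted system with the sum of J and the Steiner term $M[a^2 \delta_{\mu\nu} - a_\mu a_\nu]$.

```python
def inertia_tensor(masses, points):
    """J_mu_nu = sum_i m_i (x^2 delta_mu_nu - x_mu x_nu) for point masses."""
    J = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, points):
        x2 = sum(c * c for c in x)
        for mu in range(3):
            for nu in range(3):
                delta = x2 if mu == nu else 0.0
                J[mu][nu] += m * (delta - x[mu] * x[nu])
    return J

def center_on_mass(masses, points):
    """Shift the points so that the center of mass is at the origin."""
    M = sum(masses)
    com = [sum(m * x[k] for m, x in zip(masses, points)) / M for k in range(3)]
    return [tuple(x[k] - com[k] for k in range(3)) for x in points]

def steiner_term(M, a):
    """J_a = M (a^2 delta_mu_nu - a_mu a_nu)."""
    a2 = sum(c * c for c in a)
    return [[M * ((a2 if mu == nu else 0.0) - a[mu] * a[nu])
             for nu in range(3)] for mu in range(3)]

def add_tensors(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

if __name__ == "__main__":
    masses = [1.0, 2.0, 3.0, 4.0]
    raw = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
    points = center_on_mass(masses, raw)
    a = (0.5, -1.0, 2.0)
    shifted = [tuple(x[k] + a[k] for k in range(3)) for x in points]
    print(inertia_tensor(masses, shifted))
    print(add_tensors(inertia_tensor(masses, points), steiner_term(sum(masses), a)))
```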









Fig. 3.6. The system $\bar{K}$ is attached to the center of mass
S. One wishes to determine the inertia tensor with respect
to another body-fixed system $\bar{K}'$, which is centered on
the point S'



If $\bar{K}'$ is rotated from $\bar{K}$ by the rotation R, in addition to the shift, (3.23) generalizes
to

$$J'_{\mu\nu} = \sum_{\sigma,\tau=1}^{3} R_{\mu\sigma} R_{\nu\tau} \left[ J_{\sigma\tau} + M \left( a^2 \delta_{\sigma\tau} - a_\sigma a_\tau \right) \right]$$

or

$$J' = R (J + J_a) R^{-1} . \qquad (3.24)$$

The content of this formula is the following. First, $\bar{K}'$ is rotated by $R^{-1}$ to a position
where its axes are parallel to those of $\bar{K}$. At this point, Steiner's theorem is applied,
in the form of (3.23). Finally, the rotation is undone by applying R.



3.6 Examples of the Use of Steiner’s Theorem 

Example (i) For a ball of radius R and with spherically symmetric mass
distribution $\varrho(\mathbf{x}) = \varrho(r)$ and for any system attached to its center, the inertia tensor is
diagonal. In addition, the three moments of inertia are equal, $I_1 = I_2 = I_3 \equiv I$.
Adding them up and using (3.17), we find that

$$3I = 2 \int \varrho(r) r^2 \, d^3x = 8\pi \int_0^R \varrho(r) r^4 \, dr$$

and therefore

$$I = \frac{8\pi}{3} \int_0^R \varrho(r) r^4 \, dr .$$

We also have the relation

$$M = 4\pi \int_0^R \varrho(r) r^2 \, dr$$

for the total mass of the ball. If, furthermore, its mass distribution is
homogeneous, then

$$\varrho(r) = \frac{3M}{4\pi R^3} , \quad \text{for } r \leq R , \quad \text{and} \quad I = \frac{2}{5} M R^2 .$$
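
The two radial integrals can be evaluated numerically for any spherically symmetric density. The Python sketch below is an illustration only; it uses a simple midpoint rule, and the function names, the number of subintervals, and the sample densities are assumptions. For a homogeneous ball it should reproduce the ratio $I/(MR^2) = 2/5$.

```python
import math

def radial_integral(density, power, R, n=2000):
    """Midpoint-rule estimate of the integral of density(r) * r**power over [0, R]."""
    h = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += density(r) * r ** power
    return total * h

def ball_mass(density, R):
    """M = 4 pi * integral of rho(r) r^2 dr."""
    return 4.0 * math.pi * radial_integral(density, 2, R)

def ball_moment(density, R):
    """I = (8 pi / 3) * integral of rho(r) r^4 dr."""
    return (8.0 * math.pi / 3.0) * radial_integral(density, 4, R)

if __name__ == "__main__":
    R = 2.0
    homogeneous = lambda r: 1.5            # constant density (arbitrary units)
    M = ball_mass(homogeneous, R)
    I = ball_moment(homogeneous, R)
    print(I / (M * R ** 2))                # close to 2/5

    denser_core = lambda r: 2.0 - r / R    # density falling off towards the surface
    print(ball_moment(denser_core, R) / (ball_mass(denser_core, R) * R ** 2))
```

A density that is concentrated towards the center gives a ratio below 2/5, as one would expect.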

Example (ii) Consider a body composed of two identical, homogeneous balls of
radius R which are soldered at their point of contact T. This point is the center
of mass and, obviously, the (primed) axes drawn in Fig. 3.7 are principal axes.
We make use of the additivity of the inertia tensor and apply Steiner's theorem.
The individual ball carries half the total mass. Hence its moment of inertia is $I_0 =
M R^2/5$. In a system centered in T whose 1- and 3-axes are tangent, one ball would
have the moments of inertia, by Steiner's theorem,

$$I'_1 = I'_3 = I_0 + \frac{M}{2} R^2 ; \qquad I'_2 = I_0 .$$

The same axes are principal axes for the system of two balls and we have

$$I_1 = I_3 = 2 \left( I_0 + \frac{M}{2} R^2 \right) = \frac{7}{5} M R^2 ,$$

$$I_2 = 2 I_0 = \frac{2}{5} M R^2 .$$

Example (iii) The homogeneous children's top of Fig. 3.8 is another example for
Steiner's theorem, in its form (3.23), because its point of support O does not
coincide with the center of mass S. The mass density is homogeneous. It is not difficult
to show that the center of mass is at a distance 3h/4 from O on the symmetry
axis. The inertia tensor is diagonal in the unprimed system (centered in S) as well
as in the primed system (centered in O). The volume is $V = \pi R^2 h/3$, and the
density is $\varrho_0 = 3M/(\pi R^2 h)$. Using cylindrical coordinates,

$$x'_1 = r \cos\varphi , \quad x'_2 = r \sin\varphi , \quad x'_3 = z ,$$

the moments of inertia are easily calculated within the primed system. One finds
that

$$I'_1 = I'_2 = \varrho_0 \int d^3x' \left( x'^2_2 + x'^2_3 \right) = \frac{3}{5} M \left( \frac{1}{4} R^2 + h^2 \right) ,$$

$$I'_3 = \varrho_0 \int d^3x' \left( x'^2_1 + x'^2_2 \right) = \frac{3}{10} M R^2 .$$

The moments of inertia in the unprimed system are obtained from Steiner's
theorem, viz.

$$I_1 = I_2 = I'_1 - M a^2 , \quad \text{with } a = 3h/4 ,$$

and thus

$$I_1 = I_2 = \frac{3}{20} M \left( R^2 + \frac{h^2}{4} \right) \quad \text{and} \quad I_3 = I'_3 = \frac{3}{10} M R^2 .$$

Fig. 3.7. A rigid body consisting of two identical balls that are tangent to each other. The
primed axes are principal axes

Fig. 3.8. The children's top is an example of the application of Steiner's theorem

Example (iv) Inside an originally homogeneous ball of mass M and radius R a
pointlike mass m is placed at a distance d from the ball's center, 0 < d < R.
The inertia tensor is an extensive quantity, hence the inertia tensors of the ball and
of the point mass add. Let a be the distance of the ball's center to the center of
mass S, and b be the distance from the point mass to S. With these notations, also
shown in Fig. 3.9a, and making use of the center-of-mass condition $mb - Ma = 0$
one finds

$$a = \frac{m d}{m + M} , \qquad b = \frac{M d}{m + M} = \frac{M}{m} a .$$

A system of principal axes is obvious: Let the line joining the ball's center and the
point mass be the 3-axis (symmetry axis), then choose two orthogonal directions
in the plane orthogonal to the 3-axis through the center. Using Steiner's theorem
one has

$$I_3 = \frac{2}{5} M R^2 ,$$

$$I_1 = I_2 = \frac{2}{5} M R^2 + M a^2 + m b^2 = \frac{2}{5} M R^2 + M \left( 1 + \frac{M}{m} \right) a^2 .$$

In view of an application to a toy model to be discussed in Sect. 3.18, we add the
following remark. Define the ratios

$$\alpha = \frac{a}{R} , \qquad \delta = \frac{d}{R} = \frac{m + M}{m} \alpha .$$

A condition that will be of relevance for the analysis of that model will be

$$(1 - \alpha) I_3 < I_1 = I_2 < (1 + \alpha) I_3 . \qquad (*)$$

In the example worked out here this condition reads

$$- \frac{2}{5} < \frac{m + M}{m} \alpha < \frac{2}{5} ,$$

which says, when expressed in terms of $\delta$, that this parameter should be less than
2/5. Note that it is the upper limit in (*) which gives this bound.





Fig. 3.9. (a) A point mass is added to a ball of homogeneous mass density, thus changing the
original spherical top into a symmetric top. (b) In the same homogeneous ball a hole is cut
out that makes the top a symmetric but no longer a spherical one



Example (v) Consider the same ball (M, R ) as in the previous example but suppose 
that this time a small hollow sphere is cut out of it whose center B is at a distance 
d from the ball’s center and whose radius is r, cf. Fig. 3.9b. Referring to that figure 
the center of mass now lies below the center of the ball. The mass of the ball which 
is cut out is 

m = (^)' M ■ 

As in example (iv) let a and b denote the distances from the ball’s center to the 
center of mass S, and from the center of the hollow sphere to S, respectively. Then 
d — b — a, and the center of mass condition reads 



Ma + (-m)b = 0 . 



(Remember that the mass and the inertia tensor are extensive quantities. The minus 
sign stems from the fact that one has taken away the mass of the hole!) Choosing 
principal axes like in the previous case the moments of inertia are 

2 , 2 , 

/ 3 = -MR- mr 1 , 

2 5 5 

2 2 

Iy — 1 2 — - MR 2 + Ma 2 mr 2 — mb 2 . 

5 5 

It is interesting to follow up the inequality (*) also in this example. Inserting the 
formulae for the moments of inertia it reads 



2 2 
— a-(MR 2 — mr 2 ) < Ma 2 — mb 2 < a-(MR 2 — mr 2 ) . 
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The middle part is seen to be negative, 

2 7 7 Tf 171 

Ma — mb = —Ma~ , 

m 

so that the inequality should be multiplied by (—1). This yields 

2 / mr 2 \ M — m 2 / mr \ 

5 \ MR 2 / > ~i m “ > _ 5 V 1 _ MR 2 ) ' 

Comparison to example (iv) shows that here the lower bound of (*) gives the 
essential restriction. Converting again to S = d/R one finds 

S < 2/5(l -r 5 /^ 5 ) ■ 

This result will be useful in discussing the toy model of Sect. 3.18 below. 
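
The equivalence between the lower bound of (*) and the restriction on $\delta$ can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function names are hypothetical, and it assumes that (*) has the form $-\alpha I_3 < I_1 - I_3 < \alpha I_3$ with $\alpha = a/R$, as used in the inequality above.

```python
def hollow_ball(M, R, r, d):
    """Return (a, b, I1, I3) for a ball (M, R) from which a sphere of radius r,
    centered at distance d from the ball's center, has been removed."""
    m = (r / R) ** 3 * M            # removed mass (homogeneous density)
    a = m * d / (M - m)             # from M*a - m*b = 0 together with d = b - a
    b = a + d
    I3 = 0.4 * M * R**2 - 0.4 * m * r**2
    I1 = I3 + M * a**2 - m * b**2   # Steiner's theorem applied to ball and hole
    return a, b, I1, I3


def lower_bound_holds(M, R, r, d):
    """Lower bound of (*), assumed here as I1 - I3 > -alpha * I3 with alpha = a/R."""
    a, _, I1, I3 = hollow_ball(M, R, r, d)
    return I1 - I3 > -(a / R) * I3


def delta_criterion(R, r, d):
    """The condition delta < (2/5) * (1 - (r/R)**5) with delta = d/R."""
    return d / R < 0.4 * (1.0 - (r / R) ** 5)
```

For a hole of radius $r = 0.3R$ both functions agree: the condition is satisfied for $d = 0.35R$ and violated for $d = 0.45R$.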

Example (vi) As a last example we consider a brick with a square cross section (side length $a_1$) and height $a_3$ whose mass density is assumed to be homogeneous. If one chooses the coordinate system $K$ shown in Fig. 3.10, the inertia tensor is already diagonal. With $\varrho_0 = M/(a_1a_2a_3)$, one finds $I_1 = M(a_2^2 + a_3^2)/12$, cyclic in 1, 2, 3. The aim is now to compute the inertia tensor in the body-fixed system $K'$, whose 3-axis lies along one of the main diagonals of the brick. As $a_1 = a_2$, we find $I_1 = I_2$. Therefore, as a first step, one can rotate the 1-axis about the initial $x_3$-axis by an arbitrary amount. For example, one can choose it along a diagonal of the cross section, without changing the inertia tensor (which is diagonal). In a second step, $K'$ is reached by a rotation about the $x_2'$-axis by the angle $\varphi = \arctan(a_1\sqrt{2}/a_3)$, 



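$$ K \xrightarrow{\;R_\varphi\;} K' , \qquad R_\varphi = \begin{pmatrix} \cos\varphi & 0 & -\sin\varphi \\ 0 & 1 & 0 \\ \sin\varphi & 0 & \cos\varphi \end{pmatrix} . $$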



According to (3.24) the relation between the inertia tensors is $J' = R_\varphi J R_\varphi^T$. $J'$ is not diagonal. One finds that 

$$ J'_{11} = I_1\cos^2\varphi + I_3\sin^2\varphi = \frac{M}{12}\,\frac{4a_1^4 + a_1^2a_3^2 + a_3^4}{2a_1^2 + a_3^2} , $$

$$ J'_{22} = I_2 = \frac{M}{12}\,(a_1^2 + a_3^2) , $$

$$ J'_{33} = I_1\sin^2\varphi + I_3\cos^2\varphi = \frac{M}{12}\,\frac{2a_1^4 + 4a_1^2a_3^2}{2a_1^2 + a_3^2} , $$

$$ J'_{12} = 0 = J'_{21} = J'_{23} = J'_{32} , $$

$$ J'_{13} = J'_{31} = (I_1 - I_3)\sin\varphi\cos\varphi = \frac{M}{12}\,\frac{(a_3^2 - a_1^2)\,a_1a_3\sqrt{2}}{2a_1^2 + a_3^2} . $$



The $x_2'$-axis is a principal axis; the $x_1'$- and $x_3'$-axes are not, with one exception: if the body is a cube, i.e. if $a_1 = a_3$, $J'_{13}$ and $J'_{31}$ vanish. Thus, for a homogeneous cube, any orthogonal system attached to its center of gravity is a system of principal axes. For equal (and homogeneous) mass densities a cube of height $a$ behaves like a ball with radius $R = a\,\sqrt[5]{5/(16\pi)} \simeq 0.630\,a$. In turn, if we require the moments of inertia to be equal, for a cube and a ball of the same mass $M$, we must have $R = a\sqrt{5}/(2\sqrt{3}) \simeq 0.645\,a$. 
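
The closed-form entries of the rotated inertia tensor can be verified by carrying out the transformation of (3.24) numerically. The sketch below is an illustration only (plain Python, hypothetical function names); it assumes the rotation matrix about the 2-axis in the passive form given above.

```python
import math


def brick_inertia_in_diagonal_frame(M, a1, a3):
    """Inertia tensor J' = R J R^T of a homogeneous brick (square cross section
    a1 x a1, height a3) in axes whose 3-axis lies along a main diagonal."""
    I1 = M * (a1**2 + a3**2) / 12.0       # equals I2 because a1 = a2
    I3 = M * (a1**2 + a1**2) / 12.0
    principal = [I1, I1, I3]
    phi = math.atan2(a1 * math.sqrt(2.0), a3)
    c, s = math.cos(phi), math.sin(phi)
    R = [[c, 0.0, -s],
         [0.0, 1.0, 0.0],
         [s, 0.0, c]]
    # (R J R^T)_{ij} = sum_k R_ik I_k R_jk for a diagonal J
    return [[sum(R[i][k] * principal[k] * R[j][k] for k in range(3))
             for j in range(3)] for i in range(3)]


def closed_form_J13(M, a1, a3):
    """Off-diagonal element J'_13 as quoted in the text."""
    return (M / 12.0) * (a3**2 - a1**2) * a1 * a3 * math.sqrt(2.0) / (2 * a1**2 + a3**2)
```

With $M = 12$, $a_1 = 1$, $a_3 = 2$ the diagonal entries come out as 4, 5 and 3, and the off-diagonal entry agrees with the closed form; for a cube it vanishes.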



3.7 Angular Momentum of a Rigid Body 

The angular momentum of a rigid body can be decomposed into the angular 
momentum of its center of mass and the relative (internal) angular momentum. 
This follows from the general analysis of the mechanical systems we studied in 
Sects. 1.8-1.12. As we learnt there, the relative angular momentum is independent 
of the choice of the laboratory system and therefore is the dynamically relevant 
quantity. 

The relative angular momentum of a rigid body, i.e. the angular momentum 
with respect to its center of mass, is given by 

$$ \mathbf{L} = \sum_{i=1}^{n} m_i\,\mathbf{r}_i \times \dot{\mathbf{r}}_i , \tag{3.25a} $$

if we choose to describe the body by the discrete model (A). For case (B) it is, likewise, 

$$ \mathbf{L} = \int d^3x\,\varrho(\mathbf{x})\,\mathbf{x} \times \dot{\mathbf{x}} . \tag{3.25b} $$



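From (3.5) we have $\dot{\mathbf{x}} = \boldsymbol{\omega} \times \mathbf{x}$. Adopting the continuous version (B) from now on, this becomes 

$$ \mathbf{L} = \int d^3x\,\varrho(\mathbf{x})\,\mathbf{x} \times (\boldsymbol{\omega} \times \mathbf{x}) = \int d^3x\,\varrho(\mathbf{x})\left[\mathbf{x}^2\boldsymbol{\omega} - (\mathbf{x}\cdot\boldsymbol{\omega})\,\mathbf{x}\right] . $$

The last expression on the right-hand side is just the product of the inertia tensor and the angular velocity $\boldsymbol{\omega}$, viz. 

$$ \mathbf{L} = J\boldsymbol{\omega} . \tag{3.26} $$

Indeed, writing this in components and making use of (3.10b), we have 

$$ L_\mu = \sum_{\nu=1}^{3}\left[\int d^3x\,\varrho(\mathbf{x})\left(\mathbf{x}^2\delta_{\mu\nu} - x_\mu x_\nu\right)\right]\omega_\nu . \tag{3.26'} $$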







Fig. 3.11. The momentary angular velocity $\boldsymbol{\omega}$ and the angular momentum 
$\mathbf{L}$ of a rigid body, in general, do not point in the same direction 



The relation (3.26) tells us that the angular momentum is obtained by applying the inertia tensor to the angular velocity. We note that $\mathbf{L}$ does not point in the same direction as $\boldsymbol{\omega}$, cf. Fig. 3.11, unless $\boldsymbol{\omega}$ is one of the eigenvectors of the inertia tensor. In this case 

$$ \mathbf{L} = I_i\,\boldsymbol{\omega} , \qquad (\boldsymbol{\omega} \parallel \hat{\boldsymbol{\omega}}^{(i)}) . \tag{3.27} $$

Thereby, the eigenvalue problem (3.19) receives a further physical interpretation: it defines those directions of the angular velocity $\boldsymbol{\omega}$ for which the angular momentum $\mathbf{L}$ is parallel to $\boldsymbol{\omega}$. In this case, if $\mathbf{L}$ is conserved (i.e. fixed in space), the top rotates about this direction with constant angular velocity. 

The expression (3.13) for the rotational energy can be rewritten by means of relation (3.26): 

$$ T_{\rm rot} = \tfrac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} , \tag{3.28} $$

i.e. $2T_{\rm rot}$ is equal to $|\mathbf{L}|$ times the projection of $\boldsymbol{\omega}$ onto $\mathbf{L}$. If $\boldsymbol{\omega}$ points along one of the principal axes, (3.28) becomes, by (3.27), 

$$ T_{\rm rot} = \tfrac{1}{2}\,I_i\,\boldsymbol{\omega}^2 , \qquad (\boldsymbol{\omega} \parallel \hat{\boldsymbol{\omega}}^{(i)}) . \tag{3.29} $$

This expression for $T_{\rm rot}$ shows very clearly the analogy to the kinetic energy of the translational motion, (3.12). 

To conclude let us write the relationship between angular momentum and angular velocity by means of the "bracket" notation, 

$$ |L\rangle = \int d^3x\,\varrho(\mathbf{x})\left[\langle x|x\rangle\,\mathbf{1}_3 - |x\rangle\langle x|\right]|\omega\rangle = J|\omega\rangle . $$

This formula shows very clearly the action of the $3 \times 3$-matrix $J$ on the column vector $|\omega\rangle$ and may be more transparent than the expression (3.26') in terms of coordinates. 
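
As a small illustration of (3.26) to (3.29), the following Python sketch applies a diagonal inertia tensor to two angular velocities, one off the principal axes and one along the 3-axis. The helper names and the numerical values are chosen for illustration only.

```python
def mat_vec(J, w):
    """Product of a 3x3 matrix (list of rows) with a 3-vector."""
    return [sum(J[i][k] * w[k] for k in range(3)) for i in range(3)]


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def angular_momentum(J, omega):
    """L = J omega, cf. (3.26)."""
    return mat_vec(J, omega)


def rotational_energy(J, omega):
    """T_rot = (1/2) omega . L, cf. (3.28)."""
    L = angular_momentum(J, omega)
    return 0.5 * sum(o * l for o, l in zip(omega, L))


def is_parallel(u, v, tol=1e-12):
    """True if the cross product of u and v vanishes within tol."""
    return all(abs(c) < tol for c in cross(u, v))


J = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]   # principal-axes form
omega_generic = [1.0, 1.0, 0.0]      # not an eigenvector of J
omega_axis = [0.0, 0.0, 2.0]         # along the 3-axis
```

For `omega_generic` the angular momentum is (1, 2, 0), which is not parallel to the angular velocity; for `omega_axis` it is parallel, and the energy reduces to $\frac{1}{2}I_3\omega^2$.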



3.8 Force-Free Motion of Rigid Bodies 



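If there are no external forces, the center of mass moves uniformly along a straight line (Sect. 1.9). The angular momentum $\mathbf{L}$ is conserved (Sects. 1.10-11), 

$$ \frac{d}{dt}\mathbf{L} = 0 . \tag{3.30} $$

Similarly the kinetic energy of the rotational motion is conserved, 

$$ \frac{d}{dt}T_{\rm rot} = \frac{1}{2}\frac{d}{dt}\left(\boldsymbol{\omega}\cdot J\boldsymbol{\omega}\right) = \frac{1}{2}\frac{d}{dt}\left(\boldsymbol{\omega}\cdot\mathbf{L}\right) = 0 . \tag{3.31} $$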



(This follows from conservation of the total energy (Sect. 1.11) and of the total mo- 
mentum. The kinetic energy of translational motion is then conserved separately.) 
We study three special cases. 

(i) The spherical top. The inertia tensor is diagonal, its eigenvalues are degenerate, $I_1 = I_2 = I_3 \equiv I$. We have $\mathbf{L} = I\boldsymbol{\omega}$. As $\mathbf{L}$ is constant, this implies that $\boldsymbol{\omega}$ is constant too, 

$$ \mathbf{L} = {\rm const} \;\Rightarrow\; \boldsymbol{\omega} = \frac{1}{I}\,\mathbf{L} = {\rm const} . $$

The top rotates uniformly about a fixed axis. 

(ii) The rigid rod. This is a degenerate top. It is a linear, i.e. one-dimensional, rigid body for which 

$$ I_1 = I_2 \equiv I , \qquad I_3 = 0 , $$

where the moments of inertia refer to the axes shown in Fig. 3.12. As it has no mass outside the 3-axis, the rod cannot rotate about that axis. From (3.27) we have $L_1 = I\omega_1$, $L_2 = I\omega_2$, $L_3 = 0$. Therefore, leaving aside the center-of-mass motion, force-free motion of the rod can only be uniform rotation about any axis perpendicular to the 3-axis. 

Fig. 3.12. The rigid rod as an example of a degenerate rigid body 

Fig. 3.13. Example of a symmetric body; $I_1 = I_2 \neq I_3$ 

(iii) The (nondegenerate) symmetric top. This is an important special case and we shall analyze its motion in some detail, here and below, using different approaches. Taking the 3-axis along the symmetry axis, we have $I_1 = I_2 \neq I_3$. 

Suppose $\mathbf{L}$, the angular momentum, is given. We choose the 1-axis in the plane spanned by $\mathbf{L}$ and the 3-axis. The 2-axis being perpendicular to that plane, we have $L_2 = 0$ and hence $\omega_2 = 0$. In other words, $\boldsymbol{\omega}$ is also in the (1,3)-plane, as shown in Fig. 3.13. It is then easy to analyze the motion of the symmetric top for the case of no external forces. The velocity $\dot{\mathbf{x}} = \boldsymbol{\omega} \times \mathbf{x}$ of all points on the symmetry axis is perpendicular to the (1,3)-plane (it points "backwards" in the figure). Therefore, the symmetry axis rotates uniformly about $\mathbf{L}$, which is a fixed vector in space. This part of the motion is called regular precession. It is convenient to write $\boldsymbol{\omega}$ as the sum of components along $\mathbf{L}$ and along the 3-axis, 

$$ \boldsymbol{\omega} = \boldsymbol{\omega}_l + \boldsymbol{\omega}_{\rm pr} . \tag{3.32} $$

Clearly, the longitudinal component $\boldsymbol{\omega}_l$ is irrelevant for the precession. The component $\boldsymbol{\omega}_{\rm pr}$ is easily calculated from Fig. 3.14a. With $\omega_{\rm pr} = |\boldsymbol{\omega}_{\rm pr}|$ and $\omega_1 = \omega_{\rm pr}\sin\theta$, as well as $\omega_1 = L_1/I_1$ and $L_1 = |\mathbf{L}|\sin\theta$, one obtains 

$$ \omega_{\rm pr} = \frac{|\mathbf{L}|}{I_1} . \tag{3.33} $$

Because the symmetry axis (i.e. the 3-axis) precesses about $\mathbf{L}$ (which is fixed in space) and because at all times $\mathbf{L}$, $\boldsymbol{\omega}$, and the 3-axis lie in a plane, the angular velocity $\boldsymbol{\omega}$ also precesses about $\mathbf{L}$. In other words, $\boldsymbol{\omega}$ and the symmetry axis rotate uniformly and synchronously about the angular momentum $\mathbf{L}$, as shown in Fig. 3.14b. The cone traced out by $\boldsymbol{\omega}$ is called the space cone, while the one traced out by the symmetry axis is called the nutation cone. 

Fig. 3.14a. The angular velocity $\boldsymbol{\omega}$ is written as the sum of its component $\boldsymbol{\omega}_l$ along the symmetry axis and its component $\boldsymbol{\omega}_{\rm pr}$ along the angular momentum 

Fig. 3.14b. The symmetry axis ($x_3$) of the symmetric top and the momentary angular velocity precess uniformly about the angular momentum, which is fixed in space 



In addition to its precession as a whole, the body also rotates uniformly about its symmetry axis. The angular velocity for this part of the motion is 

$$ \omega_3 = \frac{|\mathbf{L}|\cos\theta}{I_3} . \tag{3.34} $$



Note that the analysis given above describes the motion of the symmetric rigid 
body as it is seen by an observer in the space-fixed laboratory system, i.e. the 
system where L is constant. It is instructive to ask how the same motion appears 
to an observer fixed in the body for whom the 3-axis is constant. We shall return 
to this question in Sect. 3.13 below. 
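
The decomposition (3.32) can be made concrete with a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, $\theta$ is taken to be the angle between $\mathbf{L}$ and the symmetry axis with $0 < \theta < \pi$, and the longitudinal part is obtained from the 3-component of (3.32), $\omega_3 = \omega_l + \omega_{\rm pr}\cos\theta$, a step that is not written out in the text.

```python
import math


def regular_precession(I1, I3, L_abs, theta):
    """Return (omega_pr, omega_l, omega_3) for a free symmetric top.

    theta is the angle between L and the symmetry (3-)axis; the 1-axis is
    chosen in the plane spanned by L and the 3-axis, so that L2 = 0.
    """
    L1 = L_abs * math.sin(theta)
    L3 = L_abs * math.cos(theta)
    omega_1 = L1 / I1
    omega_3 = L3 / I3
    omega_pr = omega_1 / math.sin(theta)            # reproduces (3.33): |L| / I1
    omega_l = omega_3 - omega_pr * math.cos(theta)  # component along the 3-axis
    return omega_pr, omega_l, omega_3
```

For example, `regular_precession(2.0, 1.0, 4.0, math.pi / 3)` gives a precession frequency of 2 (that is, $|\mathbf{L}|/I_1$), $\omega_3 = 2$ and a longitudinal component of 1.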



3.9 Another Parametrization of Rotations: The Euler Angles 

Our aim is to derive the equations of motion for the rigid body. As we stressed in 
the introduction (Sect. 3.1), it is essential to identify the various reference systems 
that are needed for the description of the rigid body and its motion and to distin- 
guish them clearly, at any point of the discussion. We shall proceed as follows. 
At time $t = 0$, let the body have the position shown in Fig. 3.15. Its system of principal axes (below, we use the abbreviation PA for 'principal axes') $\bar K$, at $t = 0$, then assumes the position shown in the left-hand part of the figure. We make a copy of this system, call it $K$, keep that copy fixed, and use this as the inertial system of reference. Thus, at $t = 0$, the body-fixed system and the inertial system coincide. At a later time $t$ let the body have the position shown in the right-hand part of Fig. 3.15. Its center of mass, by the action of the external forces, has moved along the trajectory drawn in the figure (if there are no external forces, its motion is uniform and along a straight line). In addition, the body as a whole is rotated away from its original orientation. 

Fig. 3.15. Two positions of a rigid body, at time $t = 0$ and at $t \neq 0$. The coordinate system $\bar K$, which is fixed in the body, is translated and rotated 



Choose now one more reference system, denoted $K_0$, that is attached to $S$, its axes being parallel, at all times, to the axes of the inertial system $K$. The actual position of the rigid body at time $t$ is then completely determined once we know the position $\mathbf{r}_S(t)$ of the center of mass and the relative position of the PA system with respect to the auxiliary system $K_0$. The first part, the knowledge of $\mathbf{r}_S(t)$, is nothing but the separation of center-of-mass motion that we studied earlier, in a more general context. Therefore, the problem of describing the motion of a rigid body is reduced to the description of its motion relative to a reference system centered in $S$, the center of mass, and whose axes have fixed directions in space. 

The relative rotation from $K_0$ to $\bar K$ can be parametrized in different ways. We may adopt the parametrization that we studied in Sect. 2.22, i.e. write the rotation matrix in the form $R(\boldsymbol{\varphi}(t))$, where the vector $\boldsymbol{\varphi}$ is now a function of time. We shall do so in Sects. 3.12 and 3.13 below. 

An alternative, and equivalent, parametrization is the one in terms of Eulerian angles. It is useful, for example, when describing rigid bodies in the framework of canonical mechanics, and we shall use it below, in Sects. 3.15-3.16. It is defined as follows. Write the general rotation $R(t) \in SO(3)$ as a product of three successive rotations in the way sketched in Fig. 3.16, 

$$ R(t) = R_3(\gamma)\,R_\eta(\beta)\,R_{3_0}(\alpha) . \tag{3.35} $$

The coordinate system is rotated first about the initial 3-axis by an angle $\alpha$. In a second step it is rotated about the intermediate 2-axis by an angle $\beta$, and lastly it is rotated about the new (and final) 3-axis by an angle $\gamma$. 

With this choice, the general motion of a rigid body is described by six functions of time, $\{\mathbf{r}_S(t), \alpha(t), \beta(t), \gamma(t)\}$, in accordance with the fact that it has six degrees of freedom. Both parametrizations, i.e. by means of 

$$ \{\mathbf{r}_S(t), R(\boldsymbol{\varphi}(t))\} \quad\text{with}\quad \boldsymbol{\varphi}(t) = \{\varphi_1(t), \varphi_2(t), \varphi_3(t)\} , \tag{3.36} $$

which is the one developed in Sect. 2.21 (2.67), and the one just described, i.e. by means of 

$$ \{\mathbf{r}_S(t), \theta_i(t)\} \quad\text{with}\quad \theta_1(t) \equiv \alpha(t) ,\; \theta_2(t) \equiv \beta(t) ,\; \theta_3(t) \equiv \gamma(t) , \tag{3.37} $$

are useful and will be used below. We remark, further, that the definition of the Eulerian angles described above is the one used in quantum mechanics. 

Fig. 3.16. Definition of Eulerian angles as in (3.35). The second rotation is about the intermediate position of the 2-axis 



3.10 Definition of Eulerian Angles 

Traditionally, the dynamics of rigid bodies makes use of a somewhat different definition of Eulerian angles. This definition is distinguished from the previous one by the choice of the axis for the second rotation in (3.35). Instead of (the intermediate position of) the 2-axis $\eta$, the coordinate frame is rotated about the intermediate position of the 1-axis $\xi$, 

$$ R(t) = R_3(\Psi)\,R_\xi(\theta)\,R_{3_0}(\Phi) . \tag{3.38} $$

Figure 3.17 illustrates this choice of successive rotations. For the sake of clarity, we have suppressed the two intermediate positions of the 2-axis. The transformation from one definition to the other can be read off Figs. 3.16 and 3.17, which were drawn such that $K_0$ and $\bar K$ have the same relative position. It is sufficient to exchange 1- and 2-axes in these figures as follows: 

Fig. 3.16            Fig. 3.17 
($2_0$-axis)   →   ($1_0$-axis) 
($1_0$-axis)   →   $-$($2_0$-axis) 

keeping the 3-axes the same. This comparison yields the relations 

$$ \Phi = \alpha + \frac{\pi}{2} \pmod{2\pi} , \qquad \theta = \beta , \qquad \Psi = \gamma - \frac{\pi}{2} \pmod{2\pi} . \tag{3.39} $$

Fig. 3.17. Another definition of Eulerian angles, following (3.38). Here the second rotation is about the intermediate position of the 1-axis 

It is easy to convince oneself that the intervals of definition for the Eulerian angles 

$$ 0 \le \alpha \le 2\pi , \qquad 0 \le \beta \le \pi , \qquad\text{and}\qquad 0 \le \gamma \le 2\pi \tag{3.40} $$

allow us to describe every rotation from $K_0$ to $\bar K$. If one chooses intervals for $\Phi$, $\theta$, and $\Psi$, 

$$ 0 \le \Phi \le 2\pi , \qquad 0 \le \theta \le \pi , \qquad\text{and}\qquad 0 \le \Psi \le 2\pi , \tag{3.40'} $$

it is clear that the additive terms $2\pi$ in (3.39) must be adjusted so as not to leave these intervals (see the Appendix on some mathematical notions). 
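
The relations (3.39) between the two definitions can be checked numerically by composing elementary rotation matrices. The sketch below is illustrative only; it assumes passive rotation matrices (the sign convention of a coordinate rotation about a fixed axis, as in the rotation about the 3-axis discussed in Sect. 3.11), for which successive rotations about the current axes multiply from the left. All function names are hypothetical.

```python
import math


def rot3(phi):
    """Passive rotation of the coordinate frame about its 3-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot2(phi):
    """Passive rotation of the coordinate frame about its 2-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]


def rot1(phi):
    """Passive rotation of the coordinate frame about its 1-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_second_axis(alpha, beta, gamma):
    """Definition (3.35): second rotation about the intermediate 2-axis."""
    return matmul(rot3(gamma), matmul(rot2(beta), rot3(alpha)))


def euler_first_axis(Phi, theta, Psi):
    """Definition (3.38): second rotation about the intermediate 1-axis."""
    return matmul(rot3(Psi), matmul(rot1(theta), rot3(Phi)))
```

With $\Phi = \alpha + \pi/2$, $\theta = \beta$ and $\Psi = \gamma - \pi/2$ the two products give the same matrix, e.g. for $(\alpha, \beta, \gamma) = (0.3, 1.1, 2.0)$.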



3.11 Equations of Motion of Rigid Bodies 

When the rigid body is represented by a finite number of mass points (with fixed 
links between them), the formulation of center-of-mass motion and relative mo- 
tion follows directly from the principles of Sects. 1.9 and 1.10. If one chooses a 
representation in terms of a continuous mass distribution this is not true a priori. 
Strictly speaking, we leave here the field of mechanics of (finitely many) particles. 
Indeed, the principle of center-of-mass motion follows only with Euler’s general- 
ization (1.99), Sect. 1.30, of (1.8b). Similarly, for finitely many point particles the 
principle of angular momentum is a consequence of the equations of motion (1.28). 
Here it follows only if the additional assumption is made that the stress tensor (to 
be defined in the mechanics of continua) is symmetric. Alternatively, one may in- 
troduce this principle as an independent law. It seems that this postulate goes back 
to L. Euler (1775). 




Let $\mathbf{P} = M\mathbf{V}$ denote the total momentum, with $\mathbf{V} = \dot{\mathbf{r}}_S(t)$, and let $\mathbf{F}$ be the resultant of the external forces. The principle of center-of-mass motion reads 

$$ \frac{d}{dt}\mathbf{P} = \mathbf{F} , \quad\text{where}\quad \mathbf{F} = \sum_{i=1}^{n}\mathbf{F}^{(i)} . \tag{3.41} $$

If, in addition, $\mathbf{F}$ is a potential force, $\mathbf{F} = -\nabla U(\mathbf{r}_S)$, we can define a Lagrangian, 

$$ L = \tfrac{1}{2}M\dot{\mathbf{r}}_S^2 + T_{\rm rot} - U(\mathbf{r}_S) . \tag{3.42} $$

Here, it is important to keep in mind the system of reference with respect to which one writes down the rotational kinetic energy. Choosing $K_0$ (the system attached to $S$ and parallel to the inertial system fixed in space), we have 

$$ T_{\rm rot} = \tfrac{1}{2}\,\boldsymbol{\omega}(t)\,\tilde J(t)\,\boldsymbol{\omega}(t) . \tag{3.43a} $$

Here $\tilde J(t)$ denotes the inertia tensor taken with respect to $K_0$. Note that it depends on time. This is so because the body rotates with respect to $K_0$. Clearly, $T_{\rm rot}$ is an invariant form. When expressed with respect to the PA system (or any other body-fixed system), it is 

$$ T_{\rm rot} = \tfrac{1}{2}\,\bar{\boldsymbol{\omega}}(t)\,J\,\bar{\boldsymbol{\omega}}(t) , \tag{3.43b} $$

where a bar marks quantities referring to the body-fixed system. $J$ is now constant; if we choose the PA system, it is diagonal, $J_{mn} = I_m\delta_{mn}$. In (3.43a) the angular velocity $\boldsymbol{\omega}(t)$ is seen from $K_0$, and hence from the laboratory, while in (3.43b) it refers to a system fixed in the body. As we learnt in studying the free motion of the symmetric top, the time evolution of the angular velocity looks different in a frame with axes of fixed direction in space than in a frame fixed in the body. 




Fig. 3.18. Rotation of the body-fixed system about the 
3-axis 



In order to clarify the situation with the two kinds of system of reference, we consider first the simplified case of a rotation about the 3-axis. Here we have to study only the transformation behavior in the (1,2)-plane. Consider first a given point $A$, with fixed coordinates $(x_1, x_2, x_3)$ with respect to $K_0$. If described with respect to $\bar K$, cf. Fig. 3.18, the same point has the coordinates 

$$ \begin{aligned} \bar x_1 &= x_1\cos\varphi + x_2\sin\varphi , \\ \bar x_2 &= -x_1\sin\varphi + x_2\cos\varphi , \\ \bar x_3 &= x_3 . \end{aligned} \tag{3.44a} $$

These equations express the passive form of rotation that we studied in Sects. 2.21 and 2.22. Take now a point $P$, fixed on the 1-axis of $\bar K$, and assume that this latter system rotates uniformly with respect to $K_0$. We then have 

$$ \varphi = \varphi(t) = \omega t ; \qquad P:\ \bar x_1 = a ,\ \bar x_2 = 0 = \bar x_3 ; $$

and, by inverting the formulae (3.44a), 

$$ \begin{aligned} x_1 &= \bar x_1\cos\omega t - \bar x_2\sin\omega t , \\ x_2 &= \bar x_1\sin\omega t + \bar x_2\cos\omega t , \\ x_3 &= \bar x_3 , \end{aligned} \tag{3.44b} $$

so that the point $P$ moves according to $(P:\ x_1 = a\cos\omega t,\ x_2 = a\sin\omega t,\ x_3 = 0)$. This is the active form of rotation. Turning now to an arbitrary rotation, we replace (3.44a) by 

$$ R(\boldsymbol{\varphi}) = \exp\left(-\sum_{i=1}^{3}\varphi_i\,\mathsf{J}_i\right) , \tag{3.45a} $$

where $\mathsf{J} = \{\mathsf{J}_1, \mathsf{J}_2, \mathsf{J}_3\}$ are the generators for infinitesimal rotations about the corresponding axes (cf. (2.73) in Sect. 2.22). Equation (3.44b) is the inverse of (3.44a). Hence, in the general case, 



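$$ \left(R(\boldsymbol{\varphi})\right)^{-1} = \left(R(\boldsymbol{\varphi})\right)^{T} = R(-\boldsymbol{\varphi}) = \exp\left(\sum_{i=1}^{3}\varphi_i\,\mathsf{J}_i\right) . \tag{3.45b} $$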
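
The distinction between the passive form (3.44a) and the active form (3.44b), and the statement that the inverse rotation is the transpose, can be illustrated for the rotation about the 3-axis. This is a minimal Python sketch with hypothetical helper names, not a general implementation of (3.45a).

```python
import math


def passive_rotation(phi):
    """Matrix of (3.44a): coordinates in the rotated frame from those in K0."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]


def apply(R, x):
    return [sum(R[i][k] * x[k] for k in range(3)) for i in range(3)]


def body_point_seen_from_K0(x_bar, omega, t):
    """Active form (3.44b): a point fixed in the body, with body coordinates
    x_bar, as seen from K0 when the body frame rotates with phi = omega * t."""
    return apply(transpose(passive_rotation(omega * t)), x_bar)
```

A point with body coordinates $(a, 0, 0)$ is found at $(a\cos\omega t, a\sin\omega t, 0)$ in $K_0$, and `transpose(passive_rotation(phi))` coincides with `passive_rotation(-phi)`.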



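The vectors $\boldsymbol{\omega}$ (angular velocity) and $\mathbf{L}$ (angular momentum) are physical quantities. They obey the (passive) transformation law 

$$ \bar{\boldsymbol{\omega}} = R\,\boldsymbol{\omega} , \qquad \bar{\mathbf{L}} = R\,\mathbf{L} , \tag{3.46} $$

$\boldsymbol{\omega}$ and $\mathbf{L}$ referring to $K_0$, $\bar{\boldsymbol{\omega}}$ and $\bar{\mathbf{L}}$ referring to $\bar K$. The inertia tensor $J$ with respect to $\bar K$ (where it is constant) and the same tensor $\tilde J$ taken with respect to $K_0$ (where it depends on time) are related by 

$$ J = R(t)\,\tilde J(t)\,R^T(t) , \qquad \tilde J(t) = R^T(t)\,J\,R(t) . \tag{3.47} $$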



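This follows from the proposition (3.24), with $\mathbf{a} = 0$. Clearly, $T_{\rm rot}$ is a scalar and hence is invariant. Indeed, inserting (3.46) into (3.43b), we obtain (3.43a), 

$$ T_{\rm rot} = \tfrac{1}{2}\,(R\boldsymbol{\omega})\,J\,(R\boldsymbol{\omega}) = \tfrac{1}{2}\,\boldsymbol{\omega}\,(R^TJR)\,\boldsymbol{\omega} = \tfrac{1}{2}\,\boldsymbol{\omega}\,\tilde J\,\boldsymbol{\omega} . $$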

The equation of motion describing the rotation is obtained from the principle of angular momentum. With reference to the system $K_0$, it tells us that the time change of the angular momentum equals the resultant external torque, 

$$ \frac{d}{dt}\mathbf{L} = \mathbf{D} . \tag{3.48} $$

Thus, adopting the discrete model (A) for the rigid body, we have 

$$ \mathbf{L} = \sum_{i=1}^{n} m_i\,\mathbf{x}^{(i)} \times \dot{\mathbf{x}}^{(i)} , \tag{3.49} $$

$$ \mathbf{D} = \sum_{i=1}^{n} \mathbf{x}^{(i)} \times \mathbf{F}^{(i)} . \tag{3.50} $$

In summary, the equations of motion (3.41) and (3.48) have the general form 

$$ M\ddot{\mathbf{r}}_S(t) = \mathbf{F}(\mathbf{r}_S, \dot{\mathbf{r}}_S, \theta_i, \dot\theta_i, t) , \tag{3.51} $$

$$ \dot{\mathbf{L}} = \mathbf{D}(\mathbf{r}_S, \dot{\mathbf{r}}_S, \theta_i, \dot\theta_i, t) , \tag{3.52} $$

where (3.51) refers to the inertial system of reference and (3.52) refers to the system $K_0$, which is centered in $S$ and has its axes parallel to those of the inertial system. 


3.12 Euler’s Equations of Motion 

In this section we apply the equation of motion (3.52) to the rigid body and, in particular, work out its specific form for this case. Inverting the second equation of (3.46) we have 

$$ \mathbf{L} = R^T(t)\,\bar{\mathbf{L}} = \tilde J(t)\,\boldsymbol{\omega}(t) . $$

Differentiating with respect to time, we obtain 

$$ \dot{\mathbf{L}} = \dot{\tilde J}\,\boldsymbol{\omega} + \tilde J\,\dot{\boldsymbol{\omega}} $$

and, by means of (3.47), also 

$$ \dot{\tilde J}(t) = \frac{d}{dt}\left[R^T(t)\,J\,R(t)\right] = \dot R^T J R + R^T J \dot R . $$

If we again replace $J$ by $\tilde J$ in this last expression, using $J = R\tilde JR^T$, this becomes 

$$ \dot{\tilde J}(t) = (\dot R^TR)\,\tilde J + \tilde J\,(R^T\dot R) . $$

We now study the specific combination of the rotation matrix and the time derivative of its transpose that appears in the time derivative of $\tilde J(t)$. Let us define 

$$ \Omega(t) \stackrel{\rm def}{=} \dot R^T(t)\,R(t) = \frac{d}{dt}\left(R^{-1}(t)\right)R(t) . \tag{3.53} $$

The transpose of this matrix, $\Omega^T = R^T\dot R$, is equal to $-\Omega$. This follows by taking the time derivative of the orthogonality condition $R^TR = \mathbf{1}$, whereby 

$$ \dot R^TR + R^T\dot R = 0 , $$

and hence 

$$ \Omega + \Omega^T = 0 . $$

Thus, we obtain 

$$ \dot{\tilde J} = \Omega\tilde J + \tilde J\Omega^T = \Omega\tilde J - \tilde J\Omega = [\Omega, \tilde J] , \tag{3.54} $$

where $[\,,]$ denotes the commutator, $[A, B] \stackrel{\rm def}{=} AB - BA$. In order to compute the action of $\Omega$ on an arbitrary vector, one must first calculate the time derivative of the rotation matrix (3.45a). The exponential is to be understood as a shorthand for its series expansion, cf. Sect. 2.20. Differentiating termwise and assuming $\dot{\boldsymbol{\varphi}}$ to be parallel to $\boldsymbol{\varphi}$, one obtains 



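$$ \frac{d}{dt}R(\boldsymbol{\varphi}(t)) = -\left[\sum_{i=1}^{3}\dot\varphi_i\,\mathsf{J}_i\right]R(\boldsymbol{\varphi}(t)) = -\left[\sum_{i=1}^{3}\omega_i\,\mathsf{J}_i\right]R . \tag{3.55} $$

From Sect. 2.22 we know that the action of $\left(\sum_{i=1}^{3}\omega_i\mathsf{J}_i\right)$ on an arbitrary vector $\mathbf{b}$ of $\mathbb{R}^3$ can be expressed by means of the cross product, viz. 

$$ \left(\sum_{i=1}^{3}\omega_i\,\mathsf{J}_i\right)\mathbf{b} = \boldsymbol{\omega} \times \mathbf{b} . $$

Obviously, an analogous formula applies to the inverse of $R$, 

$$ \frac{d}{dt}R^T(\boldsymbol{\varphi}(t)) = \left[\sum_{i=1}^{3}\omega_i\,\mathsf{J}_i\right]R^T(\boldsymbol{\varphi}(t)) . $$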



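Therefore, taking $\mathbf{b} = R^T\mathbf{a}$, we have 

$$ \dot R^T(t)\,\mathbf{a} = \boldsymbol{\omega} \times (R^T\mathbf{a}) , \tag{3.56a} $$

and from this 

$$ \Omega(t)\,\mathbf{a} = \dot R^TR\,\mathbf{a} = \boldsymbol{\omega} \times (R^TR\,\mathbf{a}) = \boldsymbol{\omega} \times \mathbf{a} . \tag{3.56b} $$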
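
For the special case of a uniform rotation about the 3-axis, (3.56b) can be checked with a finite-difference derivative. The sketch below is an illustration under that assumption; the function names and the step size `h` are arbitrary choices, and the rotation matrix is taken in the passive form of Sect. 3.11.

```python
import math


def rot3(phi):
    """Passive rotation about the 3-axis by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def omega_matrix(t, w, h=1e-6):
    """Omega(t) = (d/dt R^T) R for R(t) = rot3(w * t), using a central
    difference of step h for the time derivative."""
    plus = transpose(rot3(w * (t + h)))
    minus = transpose(rot3(w * (t - h)))
    dRt = [[(plus[i][j] - minus[i][j]) / (2.0 * h) for j in range(3)]
           for i in range(3)]
    return matmul(dRt, rot3(w * t))
```

The resulting matrix is antisymmetric, and applying it to a vector $\mathbf{a}$ reproduces $\boldsymbol{\omega} \times \mathbf{a}$ with $\boldsymbol{\omega} = (0, 0, w)$, up to the discretization error.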

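This gives us 

$$ \dot{\mathbf{L}} = \dot{\tilde J}\boldsymbol{\omega} + \tilde J\dot{\boldsymbol{\omega}} = (\Omega\tilde J - \tilde J\Omega)\boldsymbol{\omega} + \tilde J\dot{\boldsymbol{\omega}} = \Omega\tilde J\boldsymbol{\omega} + \tilde J\dot{\boldsymbol{\omega}} = \boldsymbol{\omega} \times (\tilde J\boldsymbol{\omega}) + \tilde J\dot{\boldsymbol{\omega}} = \boldsymbol{\omega} \times \mathbf{L} + \tilde J\dot{\boldsymbol{\omega}} , \tag{3.57} $$

where we have used the equation $\Omega\boldsymbol{\omega} = 0$, which follows from (3.56b). It remains to compute $\dot{\boldsymbol{\omega}}$, 

$$ \dot{\boldsymbol{\omega}} = \frac{d}{dt}\left(R^T\bar{\boldsymbol{\omega}}\right) = R^T\dot{\bar{\boldsymbol{\omega}}} + \dot R^T\bar{\boldsymbol{\omega}} . $$

The second term vanishes because, by (3.56a), 

$$ \dot R^T\bar{\boldsymbol{\omega}} = \boldsymbol{\omega} \times (R^T\bar{\boldsymbol{\omega}}) = \boldsymbol{\omega} \times \boldsymbol{\omega} = 0 . $$

Thus, $\dot{\boldsymbol{\omega}} = R^T\dot{\bar{\boldsymbol{\omega}}}$. Inserting (3.57) into the equation of motion (3.52), we obtain 

$$ \dot{\mathbf{L}} = \boldsymbol{\omega} \times \mathbf{L} + \tilde JR^T\dot{\bar{\boldsymbol{\omega}}} = \mathbf{D} . $$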



This form of the equations of motion has the drawback that it contains both quantities referring to a system of reference with space-fixed directions and quantities referring to a body-fixed system. However, it is not difficult to convert them completely to the system fixed in the body: multiply these equations with $R$ from the left and note that $R(\boldsymbol{\omega} \times \mathbf{L}) = R\boldsymbol{\omega} \times R\mathbf{L} = \bar{\boldsymbol{\omega}} \times \bar{\mathbf{L}}$, and that $R\tilde JR^T = J$ by (3.47). In this way we obtain Euler's equations in their final form, 

$$ J\dot{\bar{\boldsymbol{\omega}}} + \bar{\boldsymbol{\omega}} \times \bar{\mathbf{L}} = \bar{\mathbf{D}} . \tag{3.58} $$

All quantities now refer to the body-fixed system $\bar K$. In particular, $J$ is the (constant) inertia tensor as computed in Sect. 3.4 above. If the intrinsic system $\bar K$ is chosen to be a PA system, $J$ is diagonal. Finally, we note that $\bar{\mathbf{L}} = J\bar{\boldsymbol{\omega}}$. This shows that the equations of motion (3.58) for the unknown functions $\bar{\boldsymbol{\omega}}(t)$ are nonlinear. 



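Remark: Because of its antisymmetry, the action of the matrix $\Omega$ on any vector $\mathbf{a}$ is always the one given in (3.56b), with $\boldsymbol{\omega}$ to be calculated from the rotation $R(t) = \exp\{-S\}$, with $S = \sum_i\varphi_i(t)\,\mathsf{J}_i$. We have 

$$ \Omega = \dot R^T(t)\,R(t) = \left(\frac{d}{dt}e^{S}\right)e^{-S} = \left(\dot S + \tfrac{1}{2}\dot SS + \tfrac{1}{2}S\dot S + \ldots\right)\left(\mathbf{1} - S + \ldots\right) = \dot S + \tfrac{1}{2}[S, \dot S] + O(\varphi^2) . $$

Making use of the commutators $[\mathsf{J}_i, \mathsf{J}_j] = \varepsilon_{ijk}\mathsf{J}_k$ one derives the identity 

$$ [S, \dot S] = [S(\boldsymbol{\varphi}), S(\dot{\boldsymbol{\varphi}})] = S(\boldsymbol{\varphi} \times \dot{\boldsymbol{\varphi}}) , $$

from which one computes $\boldsymbol{\omega}$. If we make the assumption (as we did in Sect. 2.22 and also in this section above) that $\boldsymbol{\varphi} = \varphi\,\hat{\mathbf{n}}$ and $\dot{\boldsymbol{\varphi}}$ have the same direction, then $S$ commutes with $\dot S$ so that $\boldsymbol{\omega}$ and $\dot{\boldsymbol{\varphi}}$ coincide. 

One may, of course, also consider a situation where both the modulus and the direction of $\boldsymbol{\varphi}$ change with time. In this case $\boldsymbol{\varphi}$ and $\dot{\boldsymbol{\varphi}}$ are no longer parallel, and $S$ and $\dot S$ no longer commute. It then follows from (3.56b), from $\dot S\mathbf{a} = \dot{\boldsymbol{\varphi}} \times \mathbf{a}$, and from the calculation above that 

$$ \boldsymbol{\omega} = \dot{\boldsymbol{\varphi}} + \tfrac{1}{2}\,\boldsymbol{\varphi} \times \dot{\boldsymbol{\varphi}} + O(\varphi^2) . $$

In very much the same way one shows that 

$$ \bar{\boldsymbol{\omega}} = \dot{\boldsymbol{\varphi}} - \tfrac{1}{2}\,\boldsymbol{\varphi} \times \dot{\boldsymbol{\varphi}} + O(\varphi^2) . $$

In deriving Euler's equations the difference to the situation where $\dot{\boldsymbol{\varphi}}$ and $\boldsymbol{\varphi}$ are taken to be parallel is irrelevant because we may always add a constant rotation such that the modulus $\varphi$ of $\boldsymbol{\varphi}$ is small, and $\boldsymbol{\omega} \approx \dot{\boldsymbol{\varphi}}$ (see also Sect. 5.7.4). 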

3.13 Euler’s Equations Applied to a Force-Free Top 

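As a first illustration of Euler's equations we study the force-free motion of rigid bodies. If no external forces are present, the center of mass, by (3.51), moves with constant velocity along a straight line. The right-hand side of Euler's equations (3.58) vanishes, $\bar{\mathbf{D}} = 0$. If $\bar K$ is chosen to be the PA system, then $J_{ik} = I_i\delta_{ik}$ and $\bar L_i = I_i\bar\omega_i$, so that (3.58) reads 

$$ I_i\dot{\bar\omega}_i + (\bar{\boldsymbol{\omega}} \times \bar{\mathbf{L}})_i = 0 . $$

More explicitly, because $(\bar{\boldsymbol{\omega}} \times \bar{\mathbf{L}})_1 = I_3\bar\omega_2\bar\omega_3 - I_2\bar\omega_3\bar\omega_2 = (I_3 - I_2)\bar\omega_2\bar\omega_3$ (with cyclic permutation of the indices), the equations of motion read 

$$ \begin{aligned} I_1\dot{\bar\omega}_1 &= (I_2 - I_3)\,\bar\omega_2\bar\omega_3 , \\ I_2\dot{\bar\omega}_2 &= (I_3 - I_1)\,\bar\omega_3\bar\omega_1 , \\ I_3\dot{\bar\omega}_3 &= (I_1 - I_2)\,\bar\omega_1\bar\omega_2 . \end{aligned} \tag{3.59} $$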

(i) The asymmetric or triaxial top. Here $I_1 \neq I_2 \neq I_3 \neq I_1$. The equations (3.59) being nonlinear, their solution in the general case is certainly not obvious. Yet, as we shall see below, their solution can be reduced to quadratures by making use of the conservation of energy and angular momentum. Before turning to this analysis, we discuss a qualitative feature of its motion that can be read off (3.59). Without loss of generality we assume the ordering 

$$ I_1 < I_2 < I_3 . \tag{3.60} $$

Indeed, the principal axes can always be chosen and numbered in such a way that the 1-axis is the axis of the smallest moment of inertia, the 3-axis that of the largest. The right-hand sides of the first and third equations of (3.59) then have negative coefficients, while the right-hand side of the second equation has a positive coefficient. Thus, the stability behavior of a rotation about the 2-axis (the one with the intermediate moment of inertia) will be different, under the effect of a small perturbation, from that of rotations about the 1- or 3-axes. Indeed, in the latter cases the rotation is found to be stable, while in the former it is unstable (see Sect. 6.2.5). 
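
The different stability behavior can be observed by integrating (3.59) numerically for a slightly perturbed rotation about each principal axis. The following sketch uses a classical fourth-order Runge-Kutta step; the integrator, the step size, the moments of inertia $(1, 2, 3)$ and the size of the perturbation are all choices made for this illustration, not part of the text.

```python
def euler_rhs(w, I):
    """Right-hand sides of (3.59), solved for the time derivatives."""
    I1, I2, I3 = I
    return ((I2 - I3) * w[1] * w[2] / I1,
            (I3 - I1) * w[2] * w[0] / I2,
            (I1 - I2) * w[0] * w[1] / I3)


def rk4_step(w, I, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, f):
        return tuple(w[i] + f * k[i] for i in range(3))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(k1, h / 2), I)
    k3 = euler_rhs(shifted(k2, h / 2), I)
    k4 = euler_rhs(shifted(k3, h), I)
    return tuple(w[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                 for i in range(3))


def max_transverse(I, w0, axis, h=0.005, steps=8000):
    """Largest absolute value reached by the components of the angular
    velocity other than `axis` during the integration."""
    w = tuple(w0)
    worst = 0.0
    for _ in range(steps):
        w = rk4_step(w, I, h)
        worst = max(worst, max(abs(w[j]) for j in range(3) if j != axis))
    return worst
```

Starting with unit angular velocity about one axis and perturbations of $10^{-3}$ in the other two components, the transverse components stay of the order of the perturbation for the 1- and 3-axes, while for the 2-axis they grow to order one.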

We now set $x(t) \stackrel{\rm def}{=} \bar\omega_3(t)$ and make use of the two conservation laws that hold for free motion: 

$$ 2T_{\rm rot} = \sum_{i=1}^{3}I_i\,\bar\omega_i^2 = {\rm const} , \tag{3.61} $$

$$ \mathbf{L}^2 = \sum_{i=1}^{3}(I_i\,\bar\omega_i)^2 = {\rm const} . \tag{3.62} $$

Taking the combinations 

$$ \mathbf{L}^2 - 2T_{\rm rot}I_1 = I_2(I_2 - I_1)\,\bar\omega_2^2 + I_3(I_3 - I_1)\,x^2 , $$

$$ \mathbf{L}^2 - 2T_{\rm rot}I_2 = -I_1(I_2 - I_1)\,\bar\omega_1^2 + I_3(I_3 - I_2)\,x^2 , $$

we deduce the following equations: 

$$ \bar\omega_1^2 = -\frac{1}{I_1(I_2 - I_1)}\left[\mathbf{L}^2 - 2T_{\rm rot}I_2 - I_3(I_3 - I_2)\,x^2\right] \equiv -\alpha_0 + \alpha_2x^2 , $$

$$ \bar\omega_2^2 = \frac{1}{I_2(I_2 - I_1)}\left[\mathbf{L}^2 - 2T_{\rm rot}I_1 - I_3(I_3 - I_1)\,x^2\right] \equiv \beta_0 - \beta_2x^2 . $$

With the convention (3.60), all differences of moments of inertia are written so as to make the coefficients $\alpha_0$, $\alpha_2$, $\beta_0$, $\beta_2$ positive. Inserting these auxiliary relations into the third equation of (3.59) yields the differential equation 

$$ I_3\,\dot x(t) = (I_1 - I_2)\sqrt{(\beta_0 - \beta_2x^2)(-\alpha_0 + \alpha_2x^2)} \tag{3.63} $$

for $x(t)$. It can be solved by separation of variables and hence by ordinary integration (quadrature). Clearly, $\bar\omega_1(t)$ and $\bar\omega_2(t)$ obey analogous differential equations that are obtained from (3.63) by cyclic permutation. 

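(ii) The symmetric top. Without loss of generality we assume 

$$ I_1 = I_2 \neq I_3 \quad\text{and}\quad I_1 \neq 0 ,\; I_3 \neq 0 . \tag{3.64} $$

The solution of the equations (3.59) is elementary in this case. First, we note that 

$$ I_3\,\dot{\bar\omega}_3 = 0 , \quad\text{i.e.}\quad \bar\omega_3 = {\rm const} . $$

Introducing the notation 

$$ \omega_0 \stackrel{\rm def}{=} \bar\omega_3\,\frac{I_3 - I_1}{I_1} \quad (= {\rm const}) , \tag{3.65} $$

we see that the first two equations of (3.59) become 

$$ \dot{\bar\omega}_1 = -\omega_0\,\bar\omega_2 \quad\text{and}\quad \dot{\bar\omega}_2 = \omega_0\,\bar\omega_1 , $$

their solutions being 

$$ \bar\omega_1(t) = \omega_\perp\cos(\omega_0t + \tau) , \qquad \bar\omega_2(t) = \omega_\perp\sin(\omega_0t + \tau) . \tag{3.66} $$

Here $\omega_\perp$ and $\tau$ are integration constants that are chosen at will; $\omega_0$, in turn, is already fixed by the choice of the integration constant $\bar\omega_3$ in (3.65). As a result, one obtains 

$$ \bar{\boldsymbol{\omega}} = \left(\omega_\perp\cos(\omega_0t + \tau),\ \omega_\perp\sin(\omega_0t + \tau),\ \bar\omega_3\right)^T $$

and $\bar{\boldsymbol{\omega}}^2 = \omega_\perp^2 + \bar\omega_3^2$. The vector $\bar{\boldsymbol{\omega}}$ has constant length: it rotates uniformly about the 3-axis of the PA system. This is the symmetry axis of the top. 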

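As to the angular momentum with respect to the intrinsic system, one has 

$$ \begin{aligned} \bar L_1 &= I_1\,\omega_\perp\cos(\omega_0t + \tau) , \\ \bar L_2 &= I_1\,\omega_\perp\sin(\omega_0t + \tau) , \\ \bar L_3 &= I_3\,\bar\omega_3 , \\ \bar{\mathbf{L}}^2 &= I_1^2\,\omega_\perp^2 + I_3^2\,\bar\omega_3^2 . \end{aligned} \tag{3.67} $$

This shows that $\bar{\mathbf{L}}$ rotates uniformly about the symmetry axis, too. Furthermore, at any time the symmetry axis $\bar{\mathbf{f}}$, $\bar{\boldsymbol{\omega}}$ and $\bar{\mathbf{L}}$ lie in one plane. 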
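
That the functions (3.66), together with a constant third component, solve (3.59) for $I_1 = I_2$ can be confirmed numerically. The sketch below evaluates the residual of Euler's equations with a central-difference time derivative; function names, the step size and the parameter values in the usage note are illustrative assumptions.

```python
import math


def omega_body(t, I1, I3, w_perp, w3, tau=0.0):
    """Body-frame angular velocity of the free symmetric top, cf. (3.65), (3.66)."""
    w0 = w3 * (I3 - I1) / I1
    return (w_perp * math.cos(w0 * t + tau),
            w_perp * math.sin(w0 * t + tau),
            w3)


def euler_residual(t, I1, I3, w_perp, w3, tau=0.0, h=1e-5):
    """Largest absolute residual of the three equations (3.59) with I2 = I1,
    using a central difference of step h for the time derivative."""
    I = (I1, I1, I3)
    w = omega_body(t, I1, I3, w_perp, w3, tau)
    wp = omega_body(t + h, I1, I3, w_perp, w3, tau)
    wm = omega_body(t - h, I1, I3, w_perp, w3, tau)
    dw = [(wp[i] - wm[i]) / (2.0 * h) for i in range(3)]
    rhs = ((I[1] - I[2]) * w[1] * w[2],
           (I[2] - I[0]) * w[2] * w[0],
           (I[0] - I[1]) * w[0] * w[1])
    return max(abs(I[i] * dw[i] - rhs[i]) for i in range(3))
```

For instance, with $I_1 = 2$, $I_3 = 3$, $\omega_\perp = 0.5$ and $\bar\omega_3 = 4$ (so that $\omega_0 = 2$) the residual vanishes up to the discretization error, and the modulus of the angular velocity is the same at all times.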

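It is not difficult to work out the relation of the constants of integration $\omega_\perp$, $\bar\omega_3$ (or $\omega_0$) to the integrals of the motion that are characteristic for motion without external forces: the kinetic energy $T_{\rm rot}$ and the modulus of angular momentum. One has 

$$ 2T_{\rm rot} = \sum_{i=1}^{3}I_i\,\bar\omega_i^2 = I_1\omega_\perp^2 + I_3\bar\omega_3^2 = I_1\left[\omega_\perp^2 + \frac{I_1I_3}{(I_3 - I_1)^2}\,\omega_0^2\right] , $$

$$ \mathbf{L}^2 = \bar{\mathbf{L}}^2 = I_1^2\omega_\perp^2 + I_3^2\bar\omega_3^2 = I_1^2\left[\omega_\perp^2 + \frac{I_3^2}{(I_3 - I_1)^2}\,\omega_0^2\right] , $$

from which one obtains 

$$ \omega_\perp^2 = \frac{1}{I_1(I_3 - I_1)}\left[2I_3T_{\rm rot} - \mathbf{L}^2\right] , \tag{3.68} $$

$$ \omega_0^2 = \frac{I_1 - I_3}{I_1^2I_3}\left[2I_1T_{\rm rot} - \mathbf{L}^2\right] . \tag{3.69} $$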
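
The relations between the integration constants and the two integrals of the motion can be checked by a round trip: compute $2T_{\rm rot}$ and $\mathbf{L}^2$ from given constants and recover the constants again. The following Python sketch does this; the function names are hypothetical and the formulas are those obtained by solving the two expressions above for $\omega_\perp^2$ and $\omega_0^2$.

```python
def integrals_of_motion(I1, I3, w_perp, w3):
    """Return (2 T_rot, L^2) of the free symmetric top."""
    two_T = I1 * w_perp**2 + I3 * w3**2
    L_sq = I1**2 * w_perp**2 + I3**2 * w3**2
    return two_T, L_sq


def constants_from_integrals(I1, I3, two_T, L_sq):
    """Return (w_perp^2, w0^2) expressed by 2 T_rot and L^2."""
    w_perp_sq = (I3 * two_T - L_sq) / (I1 * (I3 - I1))
    w0_sq = (I1 - I3) * (I1 * two_T - L_sq) / (I1**2 * I3)
    return w_perp_sq, w0_sq
```

With $I_1 = 2$, $I_3 = 3$, $\omega_\perp = 0.5$ and $\bar\omega_3 = 4$ one has $\omega_0 = 2$, and the round trip returns $\omega_\perp^2 = 0.25$ and $\omega_0^2 = 4$.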



Finally, one may wish to translate these results to a description of the same motion with respect to the system $K_0$ of Fig. 3.15. Because there are no external forces, this is an inertial system. Denoting the symmetry axis of the top by $\bar{\mathbf{f}}$ (this is the 3-axis of $\bar K$), the same unit vector, with respect to $K_0$, depends on time and is given by 

$$ \mathbf{f}(t) = R^T(t)\,\bar{\mathbf{f}} . $$

Since $\bar{\mathbf{L}}$, $\bar{\boldsymbol{\omega}}$, and $\bar{\mathbf{f}}$ are always in a plane, so are $\mathbf{L} = R^T\bar{\mathbf{L}}$, $\boldsymbol{\omega} = R^T\bar{\boldsymbol{\omega}}$, and $\mathbf{f}$. Being conserved, the angular momentum $\mathbf{L}$ is a constant vector in space, while $\boldsymbol{\omega}$ and $\mathbf{f}$ rotate uniformly and synchronously about this direction. 

Fig. 3.19. For a free symmetric top the angular momentum $\mathbf{L}$ is conserved and hence fixed in space. The axis of symmetry $\mathbf{f}$, the angular velocity $\boldsymbol{\omega}$, and $\mathbf{L}$ always lie in a plane. $\mathbf{f}$ and $\boldsymbol{\omega}$ perform a uniform precession about $\mathbf{L}$ 



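As shown in Fig. 3.19, we call $\theta_1$, $\theta_2$ the angles between $\mathbf{L}$ and $\boldsymbol{\omega}$ and between $\boldsymbol{\omega}$ and $\mathbf{f}$, respectively, and let 

$$ \theta \stackrel{\rm def}{=} \theta_1 + \theta_2 . $$

We show that $\cos\theta$ and $\cos\theta_2$ must always have the same sign. This will help us to find the possible types of motion. We have 

$$ 2T_{\rm rot} = \mathbf{L}\cdot\boldsymbol{\omega} \quad\text{and}\quad \cos\theta_1 = \frac{2T_{\rm rot}}{|\mathbf{L}||\boldsymbol{\omega}|} . $$

As $T_{\rm rot}$ is conserved and positive, $\cos\theta_1$ is constant and positive. Thus 

$$ -\frac{\pi}{2} \le \theta_1 \le \frac{\pi}{2} . $$

Furthermore, making use of the invariance of the scalar product, we have 

$$ \boldsymbol{\omega}\cdot\mathbf{f} = \bar{\boldsymbol{\omega}}\cdot\bar{\mathbf{f}} = |\boldsymbol{\omega}|\cos\theta_2 = \bar\omega_3 = \bar L_3/I_3 = \bar{\mathbf{L}}\cdot\bar{\mathbf{f}}/I_3 = \mathbf{L}\cdot\mathbf{f}/I_3 = |\mathbf{L}|\cos\theta/I_3 . $$

It follows, indeed, that $\cos\theta$ and $\cos\theta_2$ have the same sign, at any moment of the motion. As a consequence, there can be only two types of motion, one where this sign is positive (Fig. 3.20a) and one where the sign is negative (Fig. 3.20b), $\theta_1$ being constrained as shown above, $-\pi/2 \le \theta_1 \le \pi/2$. Figure 3.20 shows the situation for $I_3 < I_1 = I_2$, i.e. for a body that is elongated like an egg or a cigar; in this case $\boldsymbol{\omega}$ lies between $\mathbf{L}$ and the symmetry axis. If $I_3 > I_1 = I_2$, i.e. for a body that has the shape of a disc or a pancake, the angular momentum lies between $\boldsymbol{\omega}$ and the symmetry axis of the top. This can be seen from one more relation between the angles $\theta$ and $\theta_2$. Take the 1-axis of the intrinsic system in the plane spanned by $\bar{\mathbf{L}}$ and $\bar{\mathbf{f}}$ (as in Sect. 3.8 (iii)). Then we have 

$$ \bar L_1/\bar L_3 = \tan\theta = I_1\bar\omega_1/(I_3\bar\omega_3) = (I_1/I_3)\tan\theta_2 . $$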

Fig. 3.20. The two types of motion of the axis of symmetry $\hat{f}$ and the angular velocity $\boldsymbol{\omega}$ about
the angular momentum $\mathbf{L}$

(iii) A practical example: the Earth. To a good approximation the Earth can be
regarded as a slightly flattened, disklike, symmetric top. Its symmetry axis is defined
by the geographic poles. Because its axis of rotation is slightly inclined, by
about 0.2", with respect to the symmetry axis, it performs a precession motion.
Neglecting the external forces acting on the Earth, we can estimate the period of
this precession as follows. We have

$$I_1 = I_2 < I_3 \quad\text{with}\quad (I_3 - I_1)/I_1 \simeq 1/300 . \qquad (3.70)$$

The frequency of precession is given by (3.65). Thus, the period, i.e. $2\pi$ divided by
that frequency, is

$$T = \frac{2\pi I_1}{(I_3 - I_1)\,\bar\omega_3} .$$

Inserting $2\pi/\bar\omega_3 = 1$ day and the ratio (3.70) one gets $T \simeq 300$ days. Experimentally,
one finds a period of about 430 days and an amplitude of a few meters. The
deviation of the measured period from the estimate is probably due to the fact that
the Earth is not really rigid.

In fact, the Earth is not free and is subject to external forces and torques exerted 
on it by the Sun and the Moon. The precession estimated above is superimposed 
upon a much longer precession with a mean period of 25 800 years (the so-called 
Platonic year). However, the fact that the free precession estimated above is so 
much faster than this extremely slow gyroscopic precession justifies the assumption 
of force-free motion on which we based our estimate. 
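
The arithmetic behind this estimate can be sketched in a few lines of Python. The function name and its arguments are illustrative choices, not notation from the text; the only inputs are the ratio (3.70), a spin period of one day, and the 25 800-year period quoted above.

```python
import math

def free_precession_period_days(flattening_ratio, spin_period_days=1.0):
    """Period T = 2*pi*I1 / ((I3 - I1) * omega3) of the free precession.

    flattening_ratio is (I3 - I1)/I1 as in (3.70); the spin period is 2*pi/omega3.
    """
    omega3 = 2.0 * math.pi / spin_period_days
    return 2.0 * math.pi / (flattening_ratio * omega3)

T_free = free_precession_period_days(1.0 / 300.0)   # in days
T_forced_days = 25_800 * 365.25                     # forced precession, in days
ratio = T_forced_days / T_free                      # separation of time scales
print(T_free, ratio)
```

The ratio of the two periods comes out larger than $10^4$, which is the separation of time scales used to justify the force-free estimate.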



3.14 The Motion of a Free Top and Geometric Constructions 

The essential features of the motion of a free, asymmetric rigid body can be
understood qualitatively, without actually solving the equations (3.63), by means of
the following constructions. The first of these refers to a reference system fixed
in space; the second refers to the intrinsic PA system and both make use of the
conservation laws for energy and angular momentum.

(i) Poinsot’s construction (with respect to a space-fixed system). The conservation
law (3.61) can be written in two equivalent ways in terms of quantities in the
reference system fixed in space,

$$2T_{\mathrm{rot}} = \boldsymbol{\omega}(t)\cdot\mathbf{L} = \boldsymbol{\omega}(t)\,\mathbf{J}(t)\,\boldsymbol{\omega}(t) = \text{const} . \qquad (3.71)$$

As $\mathbf{L}$ is fixed, (3.71) tells us that the projection of $\boldsymbol{\omega}(t)$ onto $\mathbf{L}$ is constant. Thus,
the tip of the vector $\boldsymbol{\omega}(t)$ always lies in a plane perpendicular to $\mathbf{L}$. This plane is
said to be the invariant plane. The second equality in (3.71) tells us, on the other
hand, that the tip of $\boldsymbol{\omega}(t)$ must also lie on an ellipsoid, whose position in space
changes with time, viz.

$$\sum_{i,k=1}^{3} J_{ik}(t)\,\omega_i(t)\,\omega_k(t) = 2T_{\mathrm{rot}} .$$

These two surfaces are shown in Fig. 3.21. As we also know that

$$2T_{\mathrm{rot}} = \sum_{i=1}^{3} I_i\,\bar\omega_i^2 , \quad\text{i.e.}\quad \sum_{i=1}^{3} \frac{\bar\omega_i^2}{2T_{\mathrm{rot}}/I_i} = 1 , \qquad (3.72)$$

the principal diameters $a_i$ of this ellipsoid are given by $a_i = \sqrt{2T_{\mathrm{rot}}/I_i}$. For fixed
energy, the ellipsoid has a fixed shape.



When looked at from the laboratory system, however, the ellipsoid moves as a
whole. To understand this motion, we note the relation

$$\frac{\partial T_{\mathrm{rot}}}{\partial\omega_i} = \frac{1}{2}\frac{\partial}{\partial\omega_i}\Big(\sum_{k,l=1}^{3}\omega_k J_{kl}\,\omega_l\Big) = \sum_{m=1}^{3} J_{im}\,\omega_m = L_i ,$$

which tells us that $\mathbf{L} = \nabla_{\boldsymbol{\omega}} T_{\mathrm{rot}}$. Thus, at any moment, $\mathbf{L}$ is perpendicular to the
tangent plane to the ellipsoid at the point P (the tip of $\boldsymbol{\omega}(t)$). In other words, the invariant plane
is tangent to the ellipsoid. The momentary axis of rotation is just $\boldsymbol{\omega}(t)$. Therefore,
the motion of the ellipsoid is such that it rolls over the invariant plane without
gliding. In the course of the motion, the point P traces out two curves, one on the
invariant plane and one on the ellipsoid. These curves are somewhat complicated
in the general case. In the case of a symmetric body with, say, $I_1 = I_2 > I_3$, they
are seen to be circles.

Fig. 3.21. The tip of $\boldsymbol{\omega}(t)$ wanders on the invariant plane and
on a time-dependent ellipsoid tangent to that plane

(ii) General construction within a principal-axes system. Using a PA system, we
have $\bar L_i = I_i\bar\omega_i$, so that the conserved quantities (3.61-3.62) can be written as
follows:

$$2T_{\mathrm{rot}} = \sum_{i=1}^{3}\frac{\bar L_i^2}{I_i} , \qquad (3.73)$$

$$\mathbf{L}^2 = \sum_{i=1}^{3}\bar L_i^2 . \qquad (3.74)$$

The first of these, when read as an equation for the variables $\bar L_1, \bar L_2, \bar L_3$, describes
an ellipsoid with principal axes

$$a_i = \sqrt{2T_{\mathrm{rot}}\,I_i} , \quad i = 1, 2, 3 . \qquad (3.75)$$

With the convention (3.60) they obey the inequalities $a_1 \le a_2 \le a_3$. From (3.60)
one also notes that

$$2T_{\mathrm{rot}}\,I_1 \le \bar{\mathbf{L}}^2 = \mathbf{L}^2 \le 2T_{\mathrm{rot}}\,I_3 . \qquad (3.76)$$

The second equation, (3.74), describes a sphere with radius

$$R = \sqrt{\mathbf{L}^2} \quad\text{and}\quad a_1 \le R \le a_3 . \qquad (3.77)$$

Taking both equations together, we conclude that the extremity of $\bar{\mathbf{L}}$ (this is the
angular momentum as seen from the body-fixed PA system) moves on the curves
of intersection of the ellipsoid (3.73) and the sphere (3.74). As follows from the
inequalities (3.76), or equivalently from (3.77), these two surfaces do indeed intersect.
This yields the picture shown in Fig. 3.22. As the figure shows, the vector $\bar{\mathbf{L}}$
performs periodic motions in all cases. One also sees that rotations in the neighborhood
of the 1-axis (principal axis with the smallest moment of inertia), as well as
rotations in the neighborhood of the 3-axis (principal axis with the largest moment
of inertia) are stable. Rotations with $\bar{\mathbf{L}}$ close to the 2-axis, on the other hand, look
unstable. One is led to suspect that even a small perturbation will completely upset
the motion. That this is indeed so will be shown in Sect. 6.2.5.

Fig. 3.22. The angular momentum $\bar{\mathbf{L}}$, as seen from a reference system fixed in the body, moves
along the curves of intersection of the spheres (3.74) and of the ellipsoids (3.73)
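
The qualitative statements of construction (ii) can be checked numerically. The following is a minimal sketch, not part of the original text: it integrates Euler's equations for a free top in the body-fixed frame (written in the cyclic form of (3.59), $I_3\dot{\bar\omega}_3 = (I_1 - I_2)\bar\omega_1\bar\omega_2$ and permutations) with a hand-written Runge-Kutta step. The moments of inertia, initial values, step size and all function names are illustrative assumptions.

```python
def euler_rhs(w, I):
    """Right-hand side of Euler's equations for a free top (body-fixed components)."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, I, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(a, k, s):
        return tuple(x + s * y for x, y in zip(a, k))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(w, k1, dt / 2), I)
    k3 = euler_rhs(shifted(w, k2, dt / 2), I)
    k4 = euler_rhs(shifted(w, k3, dt), I)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

def integrate(w0, I, t_end=40.0, dt=0.005):
    """Return the list of angular velocities along the trajectory."""
    traj = [tuple(w0)]
    for _ in range(int(round(t_end / dt))):
        traj.append(rk4_step(traj[-1], I, dt))
    return traj

def two_T(w, I):
    """Twice the rotational energy, cf. (3.73)."""
    return sum(Ii * wi * wi for Ii, wi in zip(I, w))

def L_squared(w, I):
    """Squared angular momentum, cf. (3.74)."""
    return sum((Ii * wi) ** 2 for Ii, wi in zip(I, w))

I = (1.0, 2.0, 3.0)                               # I1 < I2 < I3
near_axis_1 = integrate((1.0, 0.01, 0.01), I)     # rotation close to the 1-axis
near_axis_2 = integrate((0.01, 1.0, 0.01), I)     # rotation close to the 2-axis
print(min(w[0] for w in near_axis_1), min(w[1] for w in near_axis_2))
```

In this sketch the component along the 1-axis stays close to its initial value, whereas the component along the 2-axis changes sign in the course of the motion, while $2T_{\mathrm{rot}}$ and $\mathbf{L}^2$ remain constant up to the integration error.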



3.15 The Rigid Body in the Framework of Canonical Mechanics

The aim of this section is to derive once more the equations of motion of rigid 
bodies, this time by means of a Lagrangian function that is expressed in terms of 
Eulerian angles. In a second step we wish to find the generalized momenta that are 
canonically conjugate to these variables. Finally, via a Legendre transformation, 
we wish to construct a Hamiltonian function for the rigid body. 

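(i) Angular velocity and Eulerian angles. In a first step we must calculate the
components of the angular velocity $\boldsymbol{\omega}$ with respect to a PA system, following (3.35),
and express them in terms of Eulerian angles as defined in Sect. 3.10. A simple,
geometric way of doing this is to start from Fig. 3.16 or 3.23. To the three time-dependent
rotations in (3.35) there correspond the angular velocities $\boldsymbol{\omega}_\alpha$, $\boldsymbol{\omega}_\beta$, and
$\boldsymbol{\omega}_\gamma$. Here, $\boldsymbol{\omega}_\alpha$ points along the $3_0$-axis, $\boldsymbol{\omega}_\beta$ along the axis $S_\eta$, and $\boldsymbol{\omega}_\gamma$ along the
3-axis. If 1, 2, and 3 denote the principal axes, as before, and if $(\boldsymbol{\omega}_\alpha)_i$ denotes
the component of $\boldsymbol{\omega}_\alpha$ along the axis $i$, the following decompositions are obtained
from Fig. 3.23:

$$(\boldsymbol{\omega}_\beta)_1 = \dot\beta\sin\gamma , \quad (\boldsymbol{\omega}_\beta)_2 = \dot\beta\cos\gamma , \quad (\boldsymbol{\omega}_\beta)_3 = 0 , \qquad (3.78)$$

$$(\boldsymbol{\omega}_\alpha)_3 = \dot\alpha\cos\beta , \quad (\boldsymbol{\omega}_\alpha)_{\xi_2} = -\dot\alpha\sin\beta , \qquad (3.79a)$$

from which follows

$$(\boldsymbol{\omega}_\alpha)_1 = -\dot\alpha\sin\beta\cos\gamma , \quad (\boldsymbol{\omega}_\alpha)_2 = \dot\alpha\sin\beta\sin\gamma , \qquad (3.79b)$$

and finally

$$(\boldsymbol{\omega}_\gamma)_1 = 0 , \quad (\boldsymbol{\omega}_\gamma)_2 = 0 , \quad (\boldsymbol{\omega}_\gamma)_3 = \dot\gamma . \qquad (3.80)$$

Thus, the angular velocity $\boldsymbol{\omega} = \boldsymbol{\omega}_\alpha + \boldsymbol{\omega}_\beta + \boldsymbol{\omega}_\gamma$ is given by $\bar{\boldsymbol{\omega}} = (\bar\omega_1, \bar\omega_2, \bar\omega_3)^T$ with

$$\bar\omega_1 = \dot\beta\sin\gamma - \dot\alpha\sin\beta\cos\gamma ,$$
$$\bar\omega_2 = \dot\beta\cos\gamma + \dot\alpha\sin\beta\sin\gamma , \qquad (3.81)$$
$$\bar\omega_3 = \dot\alpha\cos\beta + \dot\gamma .$$

Fig. 3.23. Construction that helps to express $\boldsymbol{\omega}$ by the time derivatives of the Eulerian angles.
Definition as in Fig. 3.16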

It is easy to translate these results to the definition of Eulerian angles as given
in Sect. 3.10. The transformation rules (3.39) tell us that in (3.81) $\cos\gamma$ must be
replaced by $-\sin\Psi$ and $\sin\gamma$ by $\cos\Psi$, giving

$$\bar\omega_1 = \dot\theta\cos\Psi + \dot\Phi\sin\theta\sin\Psi ,$$
$$\bar\omega_2 = -\dot\theta\sin\Psi + \dot\Phi\sin\theta\cos\Psi , \qquad (3.82)$$
$$\bar\omega_3 = \dot\Phi\cos\theta + \dot\Psi .$$

The functions $\bar\omega_i(t)$ obey the system of differential equations (3.58), Euler’s equations.
Once they are known, by inverting (3.82) and solving for $\dot\Phi$, $\dot\theta$, and $\dot\Psi$, one
obtains the following system of coupled differential equations

$$\dot\Phi = [\bar\omega_1\sin\Psi + \bar\omega_2\cos\Psi]/\sin\theta ,$$
$$\dot\theta = \bar\omega_1\cos\Psi - \bar\omega_2\sin\Psi , \qquad (3.83)$$
$$\dot\Psi = \bar\omega_3 - [\bar\omega_1\sin\Psi + \bar\omega_2\cos\Psi]\cot\theta .$$

The solutions of this system $\{\Phi(t), \theta(t), \Psi(t)\}$ describe the actual motion completely.
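
That (3.83) is indeed the inverse of (3.82) can be confirmed with a short numerical round trip. The sketch below is an illustration only; the function names and the sample values are assumptions, and the inversion is of course singular where $\sin\theta = 0$.

```python
import math

def body_omega(phi_dot, theta_dot, psi_dot, theta, psi):
    """Body-fixed components of omega from the Euler-angle rates, as in (3.82)."""
    w1 = theta_dot * math.cos(psi) + phi_dot * math.sin(theta) * math.sin(psi)
    w2 = -theta_dot * math.sin(psi) + phi_dot * math.sin(theta) * math.cos(psi)
    w3 = phi_dot * math.cos(theta) + psi_dot
    return w1, w2, w3

def euler_rates(w1, w2, w3, theta, psi):
    """Euler-angle rates from body-fixed omega, as in (3.83); needs sin(theta) != 0."""
    s = w1 * math.sin(psi) + w2 * math.cos(psi)
    phi_dot = s / math.sin(theta)
    theta_dot = w1 * math.cos(psi) - w2 * math.sin(psi)
    psi_dot = w3 - s / math.tan(theta)
    return phi_dot, theta_dot, psi_dot

rates = (0.7, -0.3, 2.1)        # sample values of (phi_dot, theta_dot, psi_dot)
theta, psi = 1.1, 0.4
recovered = euler_rates(*body_omega(*rates, theta, psi), theta, psi)
print(recovered)
```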

Making use of (3.82) we can construct a Lagrangian function in terms of Eulerian
angles. Its natural form is

$$L = T - U , \qquad (3.84)$$

where the kinetic energy is given by

$$T \equiv T_{\mathrm{rot}} = \frac{1}{2}\sum_{i=1}^{3} I_i\bar\omega_i^2 = \frac{1}{2}I_1(\dot\theta\cos\Psi + \dot\Phi\sin\theta\sin\Psi)^2$$
$$\qquad + \frac{1}{2}I_2(-\dot\theta\sin\Psi + \dot\Phi\sin\theta\cos\Psi)^2 + \frac{1}{2}I_3(\dot\Psi + \dot\Phi\cos\theta)^2 . \qquad (3.85)$$

Note that we use the second definition of Eulerian angles (see Sect. 3.10 and
Fig. 3.17) and that we assume that the center-of-mass motion is already separated
off.

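The first test is to verify that $L$, as given by (3.84), yields Euler’s equations
in the form (3.59) when there are no external forces ($U = 0$). We calculate

$$\frac{\partial L}{\partial\dot\Psi} = \frac{\partial T}{\partial\bar\omega_3}\frac{\partial\bar\omega_3}{\partial\dot\Psi} = I_3\bar\omega_3 ,$$

$$\frac{\partial L}{\partial\Psi} = \frac{\partial T}{\partial\bar\omega_1}\frac{\partial\bar\omega_1}{\partial\Psi} + \frac{\partial T}{\partial\bar\omega_2}\frac{\partial\bar\omega_2}{\partial\Psi} = (I_1 - I_2)\,\bar\omega_1\bar\omega_2 .$$

Indeed, the Euler-Lagrange equation $(\mathrm{d}/\mathrm{d}t)(\partial L/\partial\dot\Psi) = \partial L/\partial\Psi$ is identical with
the third of equations (3.59). The remaining two follow by cyclic permutation.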



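(ii) Canonical momenta and the Hamiltonian function. The momenta canonically
conjugate to the Eulerian angles are found by taking the partial derivatives of $L$
with respect to $\dot\Phi$, $\dot\theta$, and $\dot\Psi$. The momentum $p_\Psi$ is the easiest to determine:

$$p_\Psi := \frac{\partial L}{\partial\dot\Psi} = I_3(\dot\Psi + \dot\Phi\cos\theta) = \bar L_3 = \mathbf{L}\cdot\hat{e}_{\bar 3}$$
$$\quad = L_1\sin\theta\sin\Phi - L_2\sin\theta\cos\Phi + L_3\cos\theta . \qquad (3.86)$$

In the last step, $\hat{e}_{\bar 3}$, the unit vector along the 3-axis, is written in components with
respect to $K_0$ (whose axes are fixed in space). The momentum $p_\Phi$ is a little more
complicated to calculate,

$$p_\Phi := \frac{\partial L}{\partial\dot\Phi} = \sum_{i=1}^{3}\frac{\partial T}{\partial\bar\omega_i}\frac{\partial\bar\omega_i}{\partial\dot\Phi}
 = I_1\bar\omega_1\sin\theta\sin\Psi + I_2\bar\omega_2\sin\theta\cos\Psi + I_3\bar\omega_3\cos\theta$$
$$\quad = \bar{\mathbf{L}}\cdot\bar{\hat{e}}_{3_0} = L_3 . \qquad (3.87)$$

Here, we made use of the equation $\bar L_i = I_i\bar\omega_i$ and of the fact that $(\sin\theta\sin\Psi, \sin\theta\cos\Psi, \cos\theta)$
is the decomposition of the unit vector $\hat{e}_{3_0}$ along the principal axes.
Finally, the third generalized momentum is given by

$$p_\theta := \frac{\partial L}{\partial\dot\theta} = \bar L_1\cos\Psi - \bar L_2\sin\Psi = \bar{\mathbf{L}}\cdot\bar{\hat{e}}_\xi , \qquad (3.88)$$

where $\hat{e}_\xi$ is the unit vector along the line $S_\xi$ of Fig. 3.17. One verifies that

$$\det\left(\frac{\partial^2 T}{\partial\dot\Theta_i\,\partial\dot\Theta_k}\right) \neq 0 ,$$

which means that (3.86-3.88) can be solved for the $\bar\omega_i$, or, equivalently, for the
$\bar L_i$. After a little algebra one finds

$$\bar L_1 = \frac{1}{\sin\theta}(p_\Phi - p_\Psi\cos\theta)\sin\Psi + p_\theta\cos\Psi ,$$
$$\bar L_2 = \frac{1}{\sin\theta}(p_\Phi - p_\Psi\cos\theta)\cos\Psi - p_\theta\sin\Psi , \qquad (3.89)$$
$$\bar L_3 = p_\Psi .$$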

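With $T = (\sum_i \bar L_i^2/I_i)/2$ this allows us to construct the Hamiltonian function. One
obtains the expression

$$H = \frac{1}{2\sin^2\theta}(p_\Phi - p_\Psi\cos\theta)^2\left(\frac{\sin^2\Psi}{I_1} + \frac{\cos^2\Psi}{I_2}\right)
 + \frac{1}{2}p_\theta^2\left(\frac{\cos^2\Psi}{I_1} + \frac{\sin^2\Psi}{I_2}\right)$$
$$\qquad + \frac{\sin\Psi\cos\Psi}{\sin\theta}\,p_\theta\,(p_\Phi - p_\Psi\cos\theta)\left(\frac{1}{I_1} - \frac{1}{I_2}\right) + \frac{1}{2I_3}p_\Psi^2 + U . \qquad (3.90)$$

We note that $p_\Phi$ is the projection of the angular momentum onto the space-fixed $3_0$-axis,
while $p_\Psi$ is its projection onto the body-fixed 3-axis. If the potential energy
$U$ does not depend on $\Phi$, this variable is cyclic, so that $p_\Phi$ is constant, as expected.
The expression (3.90) simplifies considerably in the case of a symmetric top for
which we can again take $I_1 = I_2$, without loss of generality. If $U$ vanishes, or
does not depend on $\Psi$, then $\Psi$ is also cyclic and $p_\Psi$ is conserved as well.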
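
As a consistency check, one may evaluate the kinetic energy (3.85) for some set of angles and angular rates, compute the canonical momenta from (3.86)-(3.88), and confirm that the Hamiltonian (3.90) with $U = 0$ returns the same number. The sketch below does this; all function names and numerical values are illustrative assumptions.

```python
import math

def body_angular_momentum(I, theta, psi, phi_dot, theta_dot, psi_dot):
    """Body-fixed components L_i = I_i * omega_i, with omega_i from (3.82)."""
    w1 = theta_dot * math.cos(psi) + phi_dot * math.sin(theta) * math.sin(psi)
    w2 = -theta_dot * math.sin(psi) + phi_dot * math.sin(theta) * math.cos(psi)
    w3 = phi_dot * math.cos(theta) + psi_dot
    return tuple(Ii * wi for Ii, wi in zip(I, (w1, w2, w3)))

def kinetic_energy(I, L_body):
    """T = (sum_i L_i^2 / I_i) / 2."""
    return 0.5 * sum(Li * Li / Ii for Ii, Li in zip(I, L_body))

def canonical_momenta(L_body, theta, psi):
    """(p_Phi, p_theta, p_Psi) following (3.87), (3.88) and (3.86)."""
    L1, L2, L3 = L_body
    p_phi = (L1 * math.sin(psi) + L2 * math.cos(psi)) * math.sin(theta) + L3 * math.cos(theta)
    p_theta = L1 * math.cos(psi) - L2 * math.sin(psi)
    return p_phi, p_theta, L3

def hamiltonian(I, theta, psi, p_phi, p_theta, p_psi, U=0.0):
    """Hamiltonian function (3.90)."""
    I1, I2, I3 = I
    a = (p_phi - p_psi * math.cos(theta)) / math.sin(theta)
    s, c = math.sin(psi), math.cos(psi)
    return (0.5 * a * a * (s * s / I1 + c * c / I2)
            + 0.5 * p_theta ** 2 * (c * c / I1 + s * s / I2)
            + s * c * a * p_theta * (1 / I1 - 1 / I2)
            + p_psi ** 2 / (2 * I3) + U)

I = (1.0, 2.0, 3.0)
theta, psi = 0.9, 0.5
L_body = body_angular_momentum(I, theta, psi, 0.7, -0.3, 2.1)
T = kinetic_energy(I, L_body)
H = hamiltonian(I, theta, psi, *canonical_momenta(L_body, theta, psi))
print(T, H)
```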



(iii) Some Poisson brackets. If the Eulerian angles are denoted generically by
$\{\Theta_i(t)\}$, the Poisson brackets over the phase space (with coordinates $\Theta_i$ and $p_{\Theta_i}$)
are given by

$$\{f, g\}(\Theta_i, p_{\Theta_i}) = \sum_{i=1}^{3}\left(\frac{\partial f}{\partial p_{\Theta_i}}\frac{\partial g}{\partial\Theta_i} - \frac{\partial f}{\partial\Theta_i}\frac{\partial g}{\partial p_{\Theta_i}}\right) . \qquad (3.91)$$

The components of the angular momentum with respect to the systems $\bar K$ and $K_0$
have interesting Poisson brackets, both within each system and between them. One
finds

$$\{L_1, L_2\} = -L_3 \quad\text{(cyclic)} , \qquad (3.92)$$

$$\{\bar L_1, \bar L_2\} = +\bar L_3 \quad\text{(cyclic)} , \qquad (3.93)$$

$$\{L_i, \bar L_j\} = 0 \quad\text{for all } i \text{ and } j . \qquad (3.94)$$

Note the remarkable signs in (3.92) and (3.93). Finally, one verifies that the brackets
of the kinetic energy with all $L_i$, as well as with all $\bar L_i$, vanish,

$$\{L_i, T\} = 0 = \{\bar L_i, T\} , \quad i = 1, 2, 3 . \qquad (3.95)$$
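
The signs in (3.92)-(3.94) can be checked numerically with the bracket convention (3.91) and central finite differences. In the sketch below the body-fixed components are taken from (3.89). The expressions used for the space-fixed components are not given in the text: they are an assumption of this example, obtained by requiring $\mathbf{L}\cdot\hat{e}_{\bar 3} = p_\Psi$ as in (3.86), $L_3 = p_\Phi$, and $p_\theta$ equal to the projection onto the line of nodes. Function names, the phase-space point and the step size are illustrative.

```python
import math

# Phase-space point x = (Phi, theta, Psi, p_Phi, p_theta, p_Psi).

def body_L(x):
    """Body-fixed components of the angular momentum, following (3.89)."""
    Phi, th, Psi, pPhi, pth, pPsi = x
    a = (pPhi - pPsi * math.cos(th)) / math.sin(th)
    return (a * math.sin(Psi) + pth * math.cos(Psi),
            a * math.cos(Psi) - pth * math.sin(Psi),
            pPsi)

def space_L(x):
    """Space-fixed components (assumed form, see the lead-in)."""
    Phi, th, Psi, pPhi, pth, pPsi = x
    a = (pPsi - pPhi * math.cos(th)) / math.sin(th)
    return (a * math.sin(Phi) + pth * math.cos(Phi),
            -a * math.cos(Phi) + pth * math.sin(Phi),
            pPhi)

def partial(f, x, i, h=1e-6):
    """Central finite-difference approximation of df/dx_i."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2 * h)

def bracket(f, g, x):
    """Poisson bracket in the sign convention of (3.91)."""
    return sum(partial(f, x, i + 3) * partial(g, x, i)
               - partial(f, x, i) * partial(g, x, i + 3) for i in range(3))

x0 = (0.3, 1.1, -0.7, 0.9, 0.4, -1.2)
space_12 = bracket(lambda x: space_L(x)[0], lambda x: space_L(x)[1], x0)
body_12 = bracket(lambda x: body_L(x)[0], lambda x: body_L(x)[1], x0)
mixed_12 = bracket(lambda x: space_L(x)[0], lambda x: body_L(x)[1], x0)
print(space_12, -space_L(x0)[2], body_12, body_L(x0)[2], mixed_12)
```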
3.16 Example: The Symmetric Children’s Top in a Gravitational Field

The point of support O does not coincide with the center of mass S, their distance
being

$$\overline{OS} = l .$$

Therefore, if $I_1$ ($= I_2$) is the moment of inertia for rotations about an axis through
S that is perpendicular to the symmetry axis (the 3-axis in Fig. 3.24), then Steiner’s
theorem, Sect. 3.5, tells us that

$$I_1' = I_2' = I_1 + Ml^2$$

is the relevant moment of inertia for rotations about an axis through O that is
also perpendicular to the symmetry axis. $I_1'$ and $I_3$ were calculated in Sect. 3.6
(iii) above. Since $I_1' = I_2'$, the first two terms in $T_{\mathrm{rot}}$ (3.85) simplify, so that the
Lagrangian function for the spinning top in the earth’s gravitational field is given
by

$$L = \frac{1}{2}(I_1 + Ml^2)(\dot\theta^2 + \dot\Phi^2\sin^2\theta) + \frac{1}{2}I_3(\dot\Psi + \dot\Phi\cos\theta)^2 - Mgl\cos\theta . \qquad (3.96)$$

The variables $\Phi$ and $\Psi$ are cyclic, the momenta conjugate to them are conserved,

$$p_\Psi = \bar L_3 = I_3(\dot\Psi + \dot\Phi\cos\theta) = \text{const} , \qquad (3.97a)$$

$$p_\Phi = L_3 = (I_1'\sin^2\theta + I_3\cos^2\theta)\,\dot\Phi + I_3\dot\Psi\cos\theta = \text{const} . \qquad (3.97b)$$

As long as we neglect frictional forces, the energy is also conserved,

$$E = \frac{1}{2}I_1'(\dot\theta^2 + \dot\Phi^2\sin^2\theta) + \frac{1}{2}I_3(\dot\Psi + \dot\Phi\cos\theta)^2 + Mgl\cos\theta = \text{const} . \qquad (3.98)$$

Fig. 3.24. The symmetric children’s top in a gravitational field

From (3.97) we can isolate $\dot\Phi$ and $\dot\Psi$, viz.

$$\dot\Phi = \frac{L_3 - \bar L_3\cos\theta}{I_1'\sin^2\theta} , \quad \dot\Psi = \frac{\bar L_3}{I_3} - \dot\Phi\cos\theta . \qquad (3.99)$$

Inserting these expressions into (3.98) we obtain an equation of motion that contains
only the variable $\theta(t)$. With the new abbreviations

$$E' := E - \frac{\bar L_3^2}{2I_3} - Mgl , \qquad (3.100)$$

$$U_{\mathrm{eff}}(\theta) := \frac{(L_3 - \bar L_3\cos\theta)^2}{2I_1'\sin^2\theta} - Mgl(1 - \cos\theta) , \qquad (3.101)$$

(3.98) becomes the effective equation

$$E' = \frac{1}{2}I_1'\dot\theta^2 + U_{\mathrm{eff}}(\theta) = \text{const} , \qquad (3.102)$$

to which one can apply the methods that we developed in the first chapter. Here,
we shall restrict the discussion to a qualitative analysis.

From the positivity of the kinetic energy, the physically admissible domain of
variation of the angle is determined by the condition $E' \ge U_{\mathrm{eff}}(\theta)$. Whenever $L_3$
differs from $\bar L_3$, $U_{\mathrm{eff}}$ tends to plus infinity both for $\theta \to 0$ and for $\theta \to \pi$. Let

$$u(t) := \cos\theta(t) \qquad (3.103)$$

and therefore $\dot\theta^2 = \dot u^2/(1 - u^2)$. Equation (3.102) is then equivalent to the following
differential equation for $u(t)$:

$$\dot u^2 = f(u) , \qquad (3.104)$$

where

$$f(u) := (1 - u^2)\left[2E'/I_1' + 2Mgl(1 - u)/I_1'\right] - (L_3 - \bar L_3 u)^2/I_1'^2 . \qquad (3.105)$$

Only those values of $u(t)$ are physical which lie in the interval $[-1, +1]$ and for
which $f(u) \ge 0$. The boundaries $u = 1$ or $u = -1$ can only be physical if in the
expression (3.105) $L_3 = \bar L_3$ or $L_3 = -\bar L_3$. Both conditions of motion (the top
standing vertically in the first case and being suspended vertically in the second
case) are called sleeping. In all other cases the symmetry axis is oblique compared
to the vertical.

The function $f(u)$ has the behavior shown in Fig. 3.25. It has two zeros, $u_1$
and $u_2$, in the interval $[-1, +1]$. For $u_1 \le u \le u_2$, $f(u) \ge 0$. The case $u_1 = u_2$
is possible but arises only for very special initial conditions. The motion in the
general case $u_1 < u_2$ can be described qualitatively quite well by following the
motion of the symmetry axis on a sphere. Setting $u_0 := L_3/\bar L_3$, the first equation
(3.99) gives

$$\dot\Phi = \frac{\bar L_3}{I_1'}\,\frac{u_0 - u}{1 - u^2} . \qquad (3.106)$$

Fig. 3.25. Graph of the function $f(u)$ (3.105) with $u = \cos\theta(t)$. See also
Practical Example 1 of the Appendix

Thus, whenever $u_1 \neq u_2$, the extremity of $\hat{f}$ moves on the unit sphere between
the parallels of latitude defined by

$$\theta_i = \arccos u_i , \quad i = 1, 2 .$$

Depending on the position of $u_0$ relative to $u_1$ and $u_2$, we must distinguish three
cases.

(i) $u_0 > u_2$ (or $u_0 < u_1$). From (3.106) we see that $\dot\Phi$ always has the same
sign. Therefore, the motion looks like the one sketched in Fig. 3.26a.

(ii) $u_1 < u_0 < u_2$. In this case $\dot\Phi$ has different signs at the upper and lower
parallels. The motion of the symmetry axis $\hat{f}$ looks as sketched in Fig. 3.26b.

(iii) $u_0 = u_1$ or $u_0 = u_2$. Here $\dot\Phi$ vanishes at the lower or upper parallel, respectively.
In the second case, for example, the motion of $\hat{f}$ is the one sketched
in Fig. 3.26c.

The motion of the extremity of $\hat{f}$ on the sphere is called nutation.
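
The case distinction above lends itself to a small numerical illustration. The sketch below evaluates $f(u)$ from (3.105), locates the two turning points $u_1 < u_2$ by bisection, and compares them with $u_0 = L_3/\bar L_3$. The sample parameters are assumptions chosen for illustration: they correspond to a top with $I_1' = 1$, $I_3 = 0.5$, $Mgl = 1$, released at $\cos\theta = 0.5$ with $\dot\theta = \dot\Phi = 0$ and $\dot\Psi = 20$, which gives $\bar L_3 = 10$, $L_3 = 5$ and $E' = -0.5$. The function names and the grid-plus-bisection root search are choices of this example, not a method taken from the text.

```python
def f(u, E_prime, L3, L3_bar, I1p, Mgl):
    """f(u) of (3.105); physical motion requires f(u) >= 0 with u = cos(theta)."""
    return ((1 - u * u) * (2 * E_prime / I1p + 2 * Mgl * (1 - u) / I1p)
            - (L3 - L3_bar * u) ** 2 / I1p ** 2)

def turning_points(params, n_grid=20001):
    """Zeros u1 < u2 of f in [-1, 1], found by bisection around an interior maximum."""
    g = lambda u: f(u, *params)
    grid = [-1 + 2 * k / (n_grid - 1) for k in range(n_grid)]
    u_in = max(grid, key=g)
    if g(u_in) <= 0:
        raise ValueError("no physically admissible interval found on the grid")

    def bisect(outside, inside):
        # g(outside) <= 0 and g(inside) > 0 throughout
        for _ in range(80):
            mid = 0.5 * (outside + inside)
            if g(mid) > 0:
                inside = mid
            else:
                outside = mid
        return 0.5 * (outside + inside)

    return bisect(-1.0, u_in), bisect(1.0, u_in)

def nutation_case(u0, u1, u2, tol=1e-9):
    """Return 'i', 'ii' or 'iii' according to the case distinction in the text."""
    if abs(u0 - u1) < tol or abs(u0 - u2) < tol:
        return "iii"
    return "ii" if u1 < u0 < u2 else "i"

params = (-0.5, 5.0, 10.0, 1.0, 1.0)      # (E', L3, L3_bar, I1', Mgl), assumed values
u1, u2 = turning_points(params)
u0 = params[1] / params[2]
case = nutation_case(u0, u1, u2)
print(u1, u2, u0, case)
```

For these sample values $f(u)$ factorizes as $(1 - 2u)(-u^2 + 50u - 24)$, so the turning points are $u_2 = 1/2$ and $u_1 = 25 - \sqrt{601}$, and since $u_0 = u_2$ the motion shows the cusps of case (iii).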



3.17 More About the Spinning Top 

The analysis of the previous section can be pushed a little further. For example, 
one may ask under which condition the rotation about the vertical is stable. This 
is indeed the aim when one plays with a children’s top: one wants to have it spin, 
if possible vertically, and for as long as possible. In particular, one wishes to know 
to what extent friction at the point of support disturbs the game.

(i) Vertical rotation (standing top). For $\theta = 0$ we have $L_3 = \bar L_3$. From (3.101) one
finds that $U_{\mathrm{eff}}(0) = 0$ and therefore $E' = 0$ or $E = \bar L_3^2/(2I_3) + Mgl$. The rotation
is stable only if $U_{\mathrm{eff}}(\theta)$ has a minimum at $\theta = 0$. In the neighborhood of $\theta = 0$
we have

$$U_{\mathrm{eff}} \simeq \left[\bar L_3^2/(8I_1') - Mgl/2\right]\theta^2 .$$

The second derivative of $U_{\mathrm{eff}}$ is positive only if $\bar L_3^2 > 4MglI_1'$ or

$$\bar\omega_3^2 > 4MglI_1'/I_3^2 . \qquad (3.107)$$

Fig. 3.26. Symmetric children’s top in the gravitational field. The
figure shows the nutation of the extremity of the symmetry axis $\hat{f}$

(ii) Including friction. The motion in the presence of frictional forces can be
described qualitatively as follows. Consider a top in an oblique position with
$p_\Psi = \bar L_3 > p_\Phi = L_3$. The action of friction results in slowing down $p_\Psi$ continuously,
while leaving $p_\Phi$ practically unchanged until the two are equal, $p_\Psi = p_\Phi$.
At this moment the top spins vertically. From then on both $p_\Psi$ and $p_\Phi$ decrease
synchronously. The top remains vertical until the lower limit of the stability condition
(3.107) is reached. For $\bar\omega_3$ below that limit the motion is unstable. Even a
small perturbation will cause the top to rock and eventually to topple over.
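
The stability criterion (3.107) and the small-angle form of $U_{\mathrm{eff}}$ can be illustrated numerically. In the sketch below the parameter values and function names are assumptions made for the example; $U_{\mathrm{eff}}$ is evaluated from (3.101) with $L_3 = \bar L_3$.

```python
import math

def u_eff_vertical(theta, L3_bar, I1p, Mgl):
    """U_eff of (3.101) for L3 = L3_bar, i.e. for a top that can reach theta = 0."""
    c = math.cos(theta)
    return (L3_bar * (1 - c)) ** 2 / (2 * I1p * math.sin(theta) ** 2) - Mgl * (1 - c)

def small_angle_coefficient(L3_bar, I1p, Mgl):
    """Coefficient of theta**2 in the expansion of U_eff about theta = 0."""
    return L3_bar ** 2 / (8 * I1p) - Mgl / 2

def is_sleeping_top_stable(L3_bar, I1p, Mgl):
    """Criterion (3.107) in the form L3_bar**2 > 4 * M*g*l * I1'."""
    return L3_bar ** 2 > 4 * Mgl * I1p

def critical_spin(I1p, I3, Mgl):
    """Smallest |omega_3| for which the vertical rotation is stable, from (3.107)."""
    return 2 * math.sqrt(Mgl * I1p) / I3

theta = 1e-2
for L3_bar in (1.0, 3.0):
    ratio = u_eff_vertical(theta, L3_bar, 1.0, 1.0) / theta ** 2
    print(L3_bar, ratio, small_angle_coefficient(L3_bar, 1.0, 1.0),
          is_sleeping_top_stable(L3_bar, 1.0, 1.0))
print(critical_spin(1.0, 0.5, 1.0))
```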
3.18 Spherical Top with Friction: The “Tippe Top”

The tippe top is a symmetric, almost spherical rigid body whose moments of inertia
$I_1 = I_2$ and $I_3 \neq I_1$ fulfill a certain inequality, see (3.112) below. It differs
from a homogeneous ball essentially only in that its center of mass does not coincide
with its geometrical center. If one lets this top spin on a horizontal plane in
the earth’s gravitational field and includes friction between the top and the plane 
of support, it behaves in an astounding way. Initially it spins about the symmetry 
axis of its equilibrium position such that its center of mass is below the center 
of symmetry. The angular momentum points in an almost vertical direction and, 
hence, is nearly perpendicular to the plane. By the action of gliding friction, how- 
ever, the top quickly inverts its position so that, in a second stage, it rotates in an 
“upside-down” position before eventually coming to rest again. After this rapid 
inversion the angular momentum is again vertical. This means that the sense of 
rotation with respect to a body-fixed system has changed during inversion. As the 
center of mass is lifted in the gravitational field, the rotational energy and therefore 
the angular momentum have decreased during inversion. When the top has reached 
its upside-down position, it continues spinning while its center of mass is at rest 
with respect to the laboratory system. During this stage only rotational friction is 
at work. As this frictional force is small, the top remains in the inverted position 
for a long time before it slows down and returns to the state of no motion. 

Although this toy was apparently already known at the end of the nineteenth 
century, it was thought for a long time that its strange behavior was too complicated 
to be understood analytically and that it could only be simulated by numerically 
solving Euler’s equations. This, as we shall see, is not true. Indeed, as was shown 
recently, the salient features of this top can be described by means of the analytic 
tools of this chapter and a satisfactory and transparent prediction of its strange 
behavior is possible. This is the reason why I wish to add it to the traditional list 
of examples in the theory of spinning rigid bodies. 

The analysis is done in two steps: In a first step we prove by a simple geometric
argument that in the presence of gliding friction on the plane of support, a specific
linear combination of $L_3$, i.e. the projection of the angular momentum onto the
vertical, and of $\bar L_3$, its projection onto the top’s symmetry axis, is a constant of the
motion. On the basis of this conservation law and of the inequality for the moments
of inertia, see (3.112) below, one shows that the inverted (spinning) position is
energetically favorable compared to the upright position.

In a second step one writes the equations of motion in a specific set of variables 
which is particularly well adapted to the problem and one analyzes the dynamical 
behavior as a function of time and the stability or instability of the solutions. 

We make the following assumptions: Let the top be a sphere whose mass distribution
is inhomogeneous in such a way that the center of mass S does not coincide
with its geometrical center Z. The mass distribution is axially symmetric, but not
spherically symmetric, so that the moments of inertia referring to the axes perpendicular
to the symmetry axis are equal, but differ from the third, $I_1 = I_2 \neq I_3$. By
a suitable choice of the unit of length, the radius of the sphere is $R = 1$. The center
of mass is situated at a distance $\alpha$ from the center Z, with $0 \le \alpha < 1$, as sketched
in Fig. 3.27. There are three types of frictional force that act on the instantaneous
point of support A in the plane: rolling friction which is active whenever the top
rolls over the plane without gliding; rotational friction which acts when the top
is spinning about a vertical axis about a fixed point in the plane; and gliding friction
which acts whenever the top glides over the plane of support. We assume
that the support is such that the force due to gliding friction is much larger than
those due to the other two kinds of friction. Indeed, it turns out that it is the gliding
friction that is responsible for the inversion of the spinning top. Finally, for
the sake of simplicity, we assume that during the initial phase in which we are
interested the rotational energy $T_{\mathrm{rot}}$ is much larger than the gravitational energy
$U = mg(1 - \alpha\cos\theta)$.

3.18.1 Conservation Law and Energy Considerations 

The instantaneous velocity of the point of support A is the sum of the center of
mass’s horizontal velocity $(\dot s_1, \dot s_2)$ (i.e. the component parallel to the plane) and of
the relative velocities which stem from changes of the Euler angles. From Fig. 3.27
we deduce that a change of $\Psi$, i.e. a rotation about the body-fixed 3-axis, causes a
linear velocity of A in the plane whose magnitude is $v_\Psi = \dot\Psi\sin\theta$, while a change
of the angle $\Phi$, i.e. a rotation about the space-fixed $3_0$-axis, causes a velocity whose
magnitude is $v_\Phi = \dot\Phi\,\alpha\sin\theta$. Both act along the same direction in the plane, say
$\hat{n}$. In contrast, a change in the angle $\theta$ gives rise to a velocity with magnitude
$v_\theta = \dot\theta(1 - \alpha\cos\theta)$ in the direction $\hat{t}$ perpendicular to $\hat{n}$. Although it is not difficult
to identify these directions in Fig. 3.27 it is sufficient for our discussion to know
that the velocities related to $\Psi$ and to $\Phi$ have the same direction, while that due
to $\theta$ is perpendicular to this direction.

The effect of friction is described phenomenologically as in (6.28) below by
introducing dissipative terms $R_\Phi$, $R_\Psi$ such that

$$\dot p_\Phi = -R_\Phi , \quad \dot p_\Psi = -R_\Psi , \qquad (3.108)$$

(and an analogous equation for $p_\theta$). As we know, the canonical momenta $p_\Phi$ and
$p_\Psi$ are the projections $L_3$ and $\bar L_3$, respectively, of the angular momentum onto the
$3_0$-axis and the 3-axis, respectively. Therefore, $R_\Phi$ and $R_\Psi$ are external torques,
equal to the cross product of the distance to the corresponding axis of rotation and
the frictional force. As the force is the same in both of them, independently of its
detailed functional dependence on the velocity, and as these torques are parallel,
their ratio is equal to the ratio of the distances,

$$R_\Phi/R_\Psi = \alpha\sin\theta/\sin\theta = \alpha . \qquad (3.109)$$

As a consequence, while both $p_\Phi = L_3$ and $p_\Psi = \bar L_3$ decrease with time, the
time derivative of the specific linear combination $L_3 - \alpha\bar L_3$ vanishes, $\dot L_3 - \alpha\dot{\bar L}_3 = 0$.
This yields the integral of the motion

$$\lambda := L_3 - \alpha\bar L_3 = \text{constant} . \qquad (3.110)$$

We note in passing that this conservation law, which provides the key to an understanding
of the tippe top, has an amusing history that can be traced back³ to
1872! In fact, the quantity $\lambda$ is the projection of the angular momentum $\mathbf{L}$ onto
the vector $\mathbf{a} = \overrightarrow{AS}$. Indeed, from (3.86) and (3.87) we have

$$\lambda = \mathbf{L}\cdot(\hat{e}_{3_0} - \alpha\hat\eta) = \mathbf{L}\cdot\mathbf{a} , \quad\text{where}\quad \hat\eta = \mathbf{R}^{-1}(t)\,\hat{e}_{\bar 3} .$$

Suppose the top is launched such that the conserved quantity (3.110) is large
in the following sense

$$\lambda \gg \sqrt{mgI_1} , \qquad (3.111)$$

with $m$ the mass of the top. Suppose, furthermore, that the mass distribution is
chosen so that the moments of inertia fulfill the inequalities

$$(1 - \alpha)I_3 < I_1 < (1 + \alpha)I_3 . \qquad (3.112)$$

The first of these, (3.111), means that the gravitational energy can be neglected in comparison
to the rotational energy; the second assumption (3.112) implies that the top
has lower energy when it rotates in a completely inverted position (S above Z)
than when it rotates in its normal, upright position.

When the top has stopped gliding, its center of mass having come to rest, but
rotates about a vertical axis in a (quasi) stationary state, we can conclude that

$$\dot s_1 = \dot s_2 = 0 , \quad \dot\theta = 0 , \quad \dot\Psi + \alpha\dot\Phi = 0 ; \qquad (3.113)$$

with $\mathbf{s}(t)$ denoting the trajectory of the center of mass.

³ St. Ebenfeld, F. Scheck: Ann. Phys. (New York) 243, 195 (1995). Note that the vector $\mathbf{a}$ equals
$-\mathbf{a}$ in this reference and that the choice of convention for the rotation $\mathbf{R}(t)$, while consistent
with earlier sections of this chapter, is the inverse of the one employed there.

Inserting $I_1 = I_2 \neq I_3$ into the expression (3.85) for the kinetic energy of
rotation one finds

$$T_{\mathrm{rot}} = \frac{1}{2}I_1(\dot\theta^2 + \dot\Phi^2\sin^2\theta) + \frac{1}{2}I_3(\dot\Psi + \dot\Phi\cos\theta)^2 .$$

From this follow the generalized momenta

$$L_3 = p_\Phi = \frac{\partial L}{\partial\dot\Phi} = \dot\Phi\,(I_1\sin^2\theta + I_3\cos^2\theta) + I_3\dot\Psi\cos\theta , \qquad (3.114a)$$

$$\bar L_3 = p_\Psi = \frac{\partial L}{\partial\dot\Psi} = I_3(\dot\Psi + \dot\Phi\cos\theta) . \qquad (3.114b)$$

Inserting here the second and third of conditions (3.113), the rotational energy
becomes

$$T_{\mathrm{rot}} = \frac{1}{2}F(z)\,\dot\Phi^2 \quad\text{with}\quad F(z = \cos\theta) := I_1(1 - z^2) + I_3(z - \alpha)^2 .$$

The third of conditions (3.113), when inserted in (3.114a) and (3.114b), allows
one to re-express the constant of the motion $\lambda$ in terms of the same function, viz.

$$\lambda = \dot\Phi\left(I_1\sin^2\theta + I_3(\cos\theta - \alpha)^2\right) = \dot\Phi\,F(z) ,$$

so that the kinetic energy can be written in terms of $\lambda$ and the function $F$,

$$T_{\mathrm{rot}} = \frac{\lambda^2}{2F(z)} . \qquad (3.115)$$

The rotational energy assumes its smallest value when $F(z)$ takes its largest
value. With the assumption (3.112) this happens, in the physical range of $\theta$, for
$z = -1$, i.e. $\theta = \pi$. As the function $F(z)$ increases monotonically as $z$ runs through
the interval $[1, -1]$, i.e. from $z = 1$ to $z = -1$, the top’s rotation in the completely inverted position is favored energetically
over rotation in the upright position.
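
The role of the inequality (3.112) can be made concrete with a short numerical check of $F(z)$ and of the rotational energy (3.115). The moments of inertia, the value of $\alpha$ and the function names below are assumptions chosen for illustration; the second parameter set deliberately violates (3.112).

```python
def F(z, I1, I3, alpha):
    """F(z) = I1 (1 - z^2) + I3 (z - alpha)^2 with z = cos(theta)."""
    return I1 * (1 - z * z) + I3 * (z - alpha) ** 2

def satisfies_3_112(I1, I3, alpha):
    """Inequalities (3.112): (1 - alpha) I3 < I1 < (1 + alpha) I3."""
    return (1 - alpha) * I3 < I1 < (1 + alpha) * I3

def decreases_with_z(I1, I3, alpha, n=2001):
    """True if F decreases strictly as z grows from -1 to +1 (checked on a grid)."""
    values = [F(-1 + 2 * k / (n - 1), I1, I3, alpha) for k in range(n)]
    return all(a > b for a, b in zip(values, values[1:]))

def rotational_energy(lam, z, I1, I3, alpha):
    """T_rot = lambda^2 / (2 F(z)), cf. (3.115)."""
    return lam ** 2 / (2 * F(z, I1, I3, alpha))

tippe = (1.0, 1.1, 0.2)      # (I1, I3, alpha): fulfills (3.112)
other = (1.0, 2.0, 0.2)      # violates (3.112)
for I1, I3, alpha in (tippe, other):
    print(satisfies_3_112(I1, I3, alpha), decreases_with_z(I1, I3, alpha),
          rotational_energy(1.0, 1.0, I1, I3, alpha),
          rotational_energy(1.0, -1.0, I1, I3, alpha))
```

For the first parameter set $F$ grows monotonically from $z = 1$ to $z = -1$, so that at fixed $\lambda$ the inverted position has the lower rotational energy; for the second set $F$ is not monotonic on the interval.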



3.18.2 Equations of Motion and Solutions with Constant Energy 

Assuming that the forces due to rotational and rolling friction can be neglected, 
the possible asymptotic states of the spinning top are clearly those in which glid- 
ing friction has ceased to be active. These asymptotic states have constant energy. 
Except for the trivial state of rest, they can only be one of the following: Rotation 
in the upright or in the completely inverted position, or rotation about a nonver- 
tical direction, changing with time, whereby the top rolls over the plane without 
gliding. Let us call the former two rotational, and the latter tumbling motion. 

The simple energy consideration of Sect. 3.18.1 leaves unanswered a number of
important questions. Given the moments of inertia $I_1$ and $I_3$, which of the allowed
asymptotic states are stable? In the case where an asymptotic state is stable, which
initial conditions (i.e. when launching the top) will develop into that state under
the action of gliding friction? Finally, in what way will simple criteria such as
(3.112) be modified when the gravitational force is taken into account?

In fact, these questions touch upon the field of qualitative dynamics, which 
is the subject of Chap. 6 (cf. in particular, the notion of Liapunov stability). A 
complete analysis of this dynamical problem can be found in the reference just 
given in footnote 3, an article that should be accessible after having studied Chap. 6. 
Here we confine ourselves to constructing the equations of motion in a form that 
is well adapted to the problem, and to report the most important results of the 
analysis. 

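As described in Sect. 3.9 it is useful to introduce three frames of reference:
the space-fixed inertial system $K$, the system $K_0$ which is centered on the center
of mass and whose axes are parallel, at all times, to those of $K$, and a body-fixed
system $\bar K$ whose 3-axis is the symmetry axis of the top. In writing down the inertia
tensor we make use of the following symbolic notation: We write any vector $\mathbf{a}$ as
$|a\rangle$ and the object which is dual to it as $\langle a|$. An expression of the form $\langle b|a\rangle$ is
then just another way of writing the scalar product $\mathbf{b}\cdot\mathbf{a}$, while $|b\rangle\langle a|$ is a tensor
which, when applied to a third vector $\mathbf{c}$, yields a vector again, viz.

$$|b\rangle\langle a|c\rangle = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} .$$

In this notation the inertia tensor with respect to the system $\bar K$ reads

$$\bar{\mathbf{J}} = I_1\left[\mathbb{1} + \frac{I_3 - I_1}{I_1}\,|\hat{e}_3\rangle\langle\hat{e}_3|\right] ,$$

while its inverse reads⁴

$$\bar{\mathbf{J}}^{-1} = \frac{1}{I_1}\left[\mathbb{1} - \frac{I_3 - I_1}{I_3}\,|\hat{e}_3\rangle\langle\hat{e}_3|\right] .$$

Hence, in the frame of reference $K_0$ it has the form

$$\mathbf{J}(t) = I_1\left[\mathbb{1} + \frac{I_3 - I_1}{I_1}\,|\hat\eta\rangle\langle\hat\eta|\right] , \qquad (3.116a)$$

where $\hat\eta$ is the representation of the unit vector $\hat{e}_{\bar 3}$ with respect to $K_0$, i.e.

$$\hat\eta = \mathbf{R}^{-1}(t)\,\hat{e}_3 = \mathbf{R}^{T}(t)\,\hat{e}_3 .$$

An analogous formula holds for its inverse

$$\mathbf{J}^{-1}(t) = \frac{1}{I_1}\left[\mathbb{1} - \frac{I_3 - I_1}{I_3}\,|\hat\eta\rangle\langle\hat\eta|\right] . \qquad (3.116b)$$

⁴ In order to become familiar with this notation and calculus the reader should verify that $\langle\hat{e}_i|\bar{\mathbf{J}}|\hat{e}_k\rangle =
\mathrm{diag}(I_1, I_1, I_3)$, and that $\mathbf{J}^{-1}$ is indeed the inverse of $\mathbf{J}$.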
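
The verification suggested in footnote 4 can also be done numerically. The following sketch represents $|b\rangle\langle a|$ as a $3\times 3$ matrix built from plain Python lists and checks that (3.116b) is the inverse of (3.116a) for an arbitrarily chosen unit vector $\hat\eta$. The helper names and the sample numbers are assumptions of this example.

```python
import math

def outer(b, a):
    """Matrix of the tensor |b><a|."""
    return [[bi * aj for aj in a] for bi in b]

def apply(T, c):
    """Apply the tensor T to the vector c."""
    return [sum(T[i][k] * c[k] for k in range(3)) for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def inertia(I1, I3, eta):
    """J = I1 [1 + (I3 - I1)/I1 |eta><eta|], cf. (3.116a)."""
    P = outer(eta, eta)
    return [[I1 * ((i == j) + (I3 - I1) / I1 * P[i][j]) for j in range(3)] for i in range(3)]

def inertia_inverse(I1, I3, eta):
    """J^{-1} = (1/I1) [1 - (I3 - I1)/I3 |eta><eta|], cf. (3.116b)."""
    P = outer(eta, eta)
    return [[((i == j) - (I3 - I1) / I3 * P[i][j]) / I1 for j in range(3)] for i in range(3)]

theta, phi = 0.8, 0.3
eta = [math.sin(theta) * math.sin(phi), -math.sin(theta) * math.cos(phi), math.cos(theta)]
product = matmul(inertia(1.0, 1.1, eta), inertia_inverse(1.0, 1.1, eta))

a, b, c = [1.0, 2.0, 3.0], [0.5, -1.0, 2.0], [2.0, 0.0, 1.0]
lhs = apply(outer(b, a), c)                                  # |b><a|c>
rhs = [sum(x * y for x, y in zip(a, c)) * bi for bi in b]    # (a.c) b
print(product, lhs, rhs)
```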







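The angular velocity $\boldsymbol{\omega}$ may be taken from (3.56b). Alternatively, making use
of (3.116b), it may be expressed in terms of the angular momentum $\mathbf{L} = \mathbf{J}\cdot\boldsymbol{\omega}$:

$$\boldsymbol{\omega}(t) = \frac{1}{I_1}\left[\mathbf{L}(t) - \frac{I_3 - I_1}{I_3}\,|\hat\eta\rangle\langle\hat\eta|L\rangle\right] . \qquad (3.117)$$

The time derivative of $\hat\eta$ follows from (3.56a), the time derivative of $\mathbf{L}$ is given
by the external torque $\mathbf{N}$ (with respect to the system $K_0$), and the acceleration $\ddot{\mathbf{s}}(t)$
of the center of mass is given by the resulting external force $\mathbf{F}$ (in the system $K$).
Therefore, the equations of motion are

$$\frac{\mathrm{d}}{\mathrm{d}t}\hat\eta = \boldsymbol{\omega}\times\hat\eta = \frac{1}{I_1}\,\mathbf{L}\times\hat\eta , \qquad (3.118a)$$

$$\frac{\mathrm{d}}{\mathrm{d}t}\mathbf{L} = \mathbf{N}(\hat\eta, \mathbf{L}, \dot{\mathbf{s}}) , \qquad (3.118b)$$

$$m\ddot{\mathbf{s}} = \mathbf{F}(\hat\eta, \mathbf{L}, \dot{\mathbf{s}}) . \qquad (3.118c)$$

(We recall that $\mathbf{s}(t)$ is the trajectory of the center of mass and $\dot{\mathbf{s}}$ its velocity in the
space-fixed system $K$.)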

If we demand that the top remain on the plane at all times (no bouncing),
then the 3-component $s_3$ of the center of mass coordinate is not an independent
variable. Indeed, the condition is that both the 3-coordinate of the point A and the
3-component of its velocity $\mathbf{v} = \dot{\mathbf{s}} - \boldsymbol{\omega}\times\mathbf{a}$ are zero at all times. One easily shows
that this implies the condition

$$\dot s_3 + \frac{\alpha}{I_1}\langle\hat{e}_{3_0}|\mathbf{L}\times\hat\eta\rangle = 0 , \qquad (3.119)$$

which in turn expresses $\dot s_3$ in terms of $\hat\eta$ and $\mathbf{L}$. The third equation of motion
(3.118c) must be replaced with

$$m\ddot{\mathbf{s}}_{1,2} = \mathrm{Pr}_{1,2}\,\mathbf{F} ,$$

where the right-hand side denotes the projection of the external force $\mathbf{F}$ onto a
horizontal plane parallel to the plane of support.

The external force $\mathbf{F}$ acting on the center of mass S is the sum of the
gravitational force $\mathbf{F}_g = -mg\,\hat{e}_{3_0}$, the normal force $\mathbf{F}_n = g_n \hat{e}_{3_0}$, and the frictional
force $\mathbf{F}_{\rm fr} = -g_{\rm fr} \hat{\mathbf{v}}$. In contrast, the point A, being supported by the plane,
experiences only the normal force and the frictional force, $\mathbf{F}^{(A)} = \mathbf{F}_n + \mathbf{F}_{\rm fr}$, so that
the external torque is given by

$$\mathbf{N} = -\mathbf{a} \times \mathbf{F}^{(A)} = (\alpha\hat{\eta} - \hat{e}_{3_0}) \times (g_n \hat{e}_{3_0} - g_{\rm fr} \hat{\mathbf{v}}) .$$

This leads immediately to the final form of the equations of motion

$$\frac{d}{dt}\hat{\eta} = \boldsymbol{\omega} \times \hat{\eta} = \frac{1}{I_1}\, \mathbf{L} \times \hat{\eta} , \tag{3.120a}$$

$$\frac{d}{dt}\mathbf{L} = (\alpha\hat{\eta} - \hat{e}_{3_0}) \times (g_n \hat{e}_{3_0} - g_{\rm fr} \hat{\mathbf{v}}) , \tag{3.120b}$$

$$m\ddot{s}_{1,2} = -g_{\rm fr} \hat{\mathbf{v}} . \tag{3.120c}$$

The coefficient $g_n$ in the normal force follows from the equation $\ddot{s}_3 = -g + g_n/m$
if one calculates the left-hand side by means of (3.119). For this one must
take the orbital derivative of (3.119), which means replacing the time derivatives
of $\mathbf{L}$ and of $\hat{\eta}$ by (3.120b) and (3.120a), respectively. The result reads

$$g_n = \frac{m g I_1 \left[ 1 + \alpha\,(\eta_3 \mathbf{L}^2 - L_3 \bar{L}_3)/(g I_1^2) \right]}{I_1 + m\alpha^2 (1 - \eta_3^2) + m\alpha\mu \left[ (\eta_3 - \alpha)\,\hat{e}_{3_0} - (1 - \alpha\eta_3)\,\hat{\eta} \right] \cdot \hat{\mathbf{v}}} . \tag{3.121}$$

Here $\eta_3 = \hat{\eta} \cdot \hat{e}_{3_0}$ is the projection onto the vertical. Regarding the frictional force
we have assumed $g_{\rm fr} = \mu g_n$, with $\mu$ a (positive) coefficient of friction.
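The intermediate algebra behind (3.121) may be sketched as follows. This is a reconstruction, not quoted from the text; it uses only (3.119), (3.120a), (3.120b) and $g_{\rm fr} = \mu g_n$, with the abbreviations $L_3 = \mathbf{L}\cdot\hat{e}_{3_0}$ and $\bar{L}_3 = \mathbf{L}\cdot\hat{\eta}$ for the projections of $\mathbf{L}$ onto the vertical and onto the symmetry axis.

```latex
\begin{align*}
\ddot{s}_3 &= -\frac{\alpha}{I_1}\,\hat{e}_{3_0}\cdot
   \Bigl(\dot{\mathbf{L}}\times\hat{\eta} + \mathbf{L}\times\dot{\hat{\eta}}\Bigr)
   && \text{(time derivative of (3.119))}\\[4pt]
\hat{e}_{3_0}\cdot\bigl(\mathbf{L}\times\dot{\hat{\eta}}\bigr)
  &= \frac{1}{I_1}\,\hat{e}_{3_0}\cdot\bigl[\mathbf{L}\,(\mathbf{L}\cdot\hat{\eta})
     - \hat{\eta}\,\mathbf{L}^2\bigr]
   = \frac{1}{I_1}\bigl(L_3\bar{L}_3 - \eta_3\mathbf{L}^2\bigr)
   && \text{(using (3.120a))}\\[4pt]
\hat{e}_{3_0}\cdot\bigl(\dot{\mathbf{L}}\times\hat{\eta}\bigr)
  &= g_n\Bigl\{\alpha(1-\eta_3^2)
     + \mu\bigl[(\eta_3-\alpha)\,\hat{e}_{3_0}
     - (1-\alpha\eta_3)\,\hat{\eta}\bigr]\cdot\hat{\mathbf{v}}\Bigr\}
   && \text{(using (3.120b))}\\[4pt]
-g + \frac{g_n}{m}
  &= \frac{\alpha}{I_1^2}\bigl(\eta_3\mathbf{L}^2 - L_3\bar{L}_3\bigr)
   - \frac{\alpha}{I_1}\,g_n\Bigl\{\alpha(1-\eta_3^2)
     + \mu\bigl[(\eta_3-\alpha)\,\hat{e}_{3_0}
     - (1-\alpha\eta_3)\,\hat{\eta}\bigr]\cdot\hat{\mathbf{v}}\Bigr\}
\end{align*}
```

The third line follows from $(\mathbf{b}\times\mathbf{F})\times\hat{\eta} = \mathbf{F}\,(\mathbf{b}\cdot\hat{\eta}) - \mathbf{b}\,(\mathbf{F}\cdot\hat{\eta})$ with $\mathbf{b} = \alpha\hat{\eta} - \hat{e}_{3_0}$ and $\mathbf{F} = g_n(\hat{e}_{3_0} - \mu\hat{\mathbf{v}})$. Collecting the terms proportional to $g_n$ in the last line and multiplying by $m I_1$ gives (3.121).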

Equations (3.120a-c) provide a good starting point for a complete analysis of
the tippe top. On the one hand they are useful for studying analytic properties of
the various types of solutions; on the other hand they may be used for a numerical
treatment of specific solutions (cf. practical example 2 below). Here we report on
some of the results, taken from the work quoted in footnote 3, and refer to that
article for further details.

(i) Conservation law: It is easy to verify that the conservation law (3.110) also
follows from the equations of motion (3.120a). The orbital derivative of $\lambda$ (i.e. the
time derivative taken along orbits of the system by making use of the equations
of motion) is given by

$$\frac{d\lambda}{dt} = \frac{d\mathbf{L}}{dt} \cdot \mathbf{a} + \mathbf{L} \cdot \frac{d\mathbf{a}}{dt} .$$

The second term vanishes because, on account of (3.120a), $d\mathbf{a}/dt = -\alpha\, d\hat{\eta}/dt$ is
perpendicular to $\mathbf{L}$. The first term vanishes because the torque $\mathbf{N}$ is perpendicular
to $\mathbf{a}$.

(ii) Asymptotic states: The asymptotic states with constant energy obey the
equations of motion (3.120a) with $\mathbf{v} = 0$, the second and the third equation being
replaced by

$$\frac{d\mathbf{L}}{dt} = \alpha g_n\, \hat{\eta} \times \hat{e}_{3_0} , \qquad m\ddot{s}_{1,2} = 0 ,$$

while (3.121) simplifies to

$$g_n = mg\, \frac{1 + \alpha\,(\eta_3 \mathbf{L}^2 - L_3 \bar{L}_3)/(g I_1^2)}{1 + m\alpha^2 (1 - \eta_3^2)/I_1} .$$

The solutions of constant energy have the following general properties:

(a) The projections $L_3$ and $\bar{L}_3$ of the angular momentum $\mathbf{L}$ onto the vertical and
the symmetry axis, respectively, are conserved.

(b) The square of the angular momentum $\mathbf{L}^2$ as well as the projection $\hat{\eta} \cdot \hat{e}_{3_0} = \eta_3$
of $\hat{\eta}$ onto the vertical are conserved.

(c) At all times the vectors $\hat{e}_{3_0}$, $\hat{\eta}$, and $\mathbf{L}$ lie in a plane.

(d) The center of mass remains fixed in space, $\dot{\mathbf{s}} = 0$.

The types of motion that have constant energy have either $\eta_3 = +1$ (rotation in
the upright position), or $\eta_3 = -1$ (rotation with complete inversion), or, possibly,
$-1 < \eta_3 < +1$ if these are allowed. The latter are tumbling motions whereby the
top simultaneously rotates in an oblique (time dependent) orientation and rolls over
the plane without gliding. Whether or not tumbling motion is possible depends on
the choice of the moments of inertia.



(iii) When does the spinning top turn upside-down? The general and complete
answer to the question of which asymptotic state is reached from a given initial
condition would occupy too much space. Here, we restrict ourselves to an example
which corroborates the results of Sect. 3.18.1. For a given value $\lambda$ of the constant
of the motion (3.110) we define the following auxiliary quantities

$$A := I_3 (1 - \alpha) - I_1 + \frac{m g \alpha I_3^2}{\lambda^2}\, (1 - \alpha)^4 ,$$

$$B := I_3 (1 + \alpha) - I_1 - \frac{m g \alpha I_3^2}{\lambda^2}\, (1 + \alpha)^4 .$$

A detailed analysis of orbital stability (a so-called Liapunov analysis) for this
example yields the following results: If $A > 0$ the state with $\eta_3 = +1$ is asymptotically
stable and the top will rotate in the upright position. If, however, $A < 0$ this state is
unstable. Furthermore, if $B > 0$ then a state with $\eta_3 = -1$ is asymptotically stable;
if $B < 0$ then it is unstable. Whenever $\lambda$ is sufficiently large, cf. (3.111), the third
terms in $A$ and $B$ can be neglected. The two conditions $A < 0$ and $B > 0$, taken
together, then yield the inequalities (3.112). In this situation rotation in the upright
position is unstable, whereas rotation in a completely inverted position is stable.
No matter how the top is launched initially, it will always turn upside-down. This
is the genuine “tippe top”. In the examples (iv) and (v) of Sect. 3.6 two simple
models for such a top were described.

The other possible cases can be found in the reference quoted above. Here is
what one finds in case the initial rotation is chosen sufficiently fast (i.e. if $\lambda$ is
large in the sense of (3.111)):

(a) For $I_1 < I_3 (1 - \alpha)$ both rotation in the upright position and rotation in the
inverted position are stable. There also exists a tumbling motion (with constant
energy) but it is unstable. This top could be called indifferent because, depending
on the initial condition, it can tend either to the upright or to the inverted position.

(b) For $I_1 > I_3 (1 + \alpha)$ the two vertical positions are unstable. There is exactly
one state of tumbling motion (i.e. rotating and rolling without gliding) which is
asymptotically stable. Every initial condition will quickly lead to it.
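The case distinction in terms of $A$ and $B$ lends itself to a small helper. The sketch below is illustrative only: the function names, the labels, and the sample values of $I_1$, $I_3$, $\alpha$, $\lambda$ are assumptions, and the units are those with $m = g = R = 1$. The $\lambda$-dependent third term is written as $m g \alpha I_3^2 (1 \mp \alpha)^4/\lambda^2$, which is how the definitions of $A$ and $B$ are read here (this reading is an assumption; it is what one obtains by minimizing the energy at fixed $\lambda$).

```python
def stability_coefficients(I1, I3, alpha, lam, m=1.0, g=1.0):
    """Return (A, B); the sign of A (B) decides stability of eta_3 = +1 (-1)."""
    third = m * g * alpha * I3 ** 2 / lam ** 2   # assumed form of the third term
    A = I3 * (1.0 - alpha) - I1 + third * (1.0 - alpha) ** 4
    B = I3 * (1.0 + alpha) - I1 - third * (1.0 + alpha) ** 4
    return A, B

def classify_top(I1, I3, alpha, lam, m=1.0, g=1.0):
    """Label the top by the stability of its two vertical states."""
    A, B = stability_coefficients(I1, I3, alpha, lam, m, g)
    if A == 0.0 or B == 0.0:
        return "marginal"        # the sign criterion gives no answer
    if A < 0.0 and B > 0.0:
        return "tippe top"       # upright unstable, inverted stable
    if A > 0.0 and B > 0.0:
        return "indifferent"     # both vertical states stable, case (a)
    if A < 0.0 and B < 0.0:
        return "tumbling"        # both vertical states unstable, case (b)
    return "upright only"        # upright stable, inverted unstable

# Illustrative (assumed) parameters: I3 = 0.4, alpha = 0.1, fast rotation.
fast = 10.0
label_tippe = classify_top(0.40, 0.40, 0.1, fast)
label_indifferent = classify_top(0.30, 0.40, 0.1, fast)
label_tumbling = classify_top(0.50, 0.40, 0.1, fast)
label_slow = classify_top(0.40, 0.40, 0.1, 0.1)   # small lambda: third terms dominate
```

For small $\lambda$ the third terms dominate, $A$ becomes positive and $B$ negative, so a slowly spinning top of any of the three types stays upright.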



Appendix: Practical Examples 

1. Symmetric Top in a Gravitational Field. Study quantitatively the motion of a
symmetric spinning top in the earth’s gravitational field (a qualitative description
is given in Sect. 3.16).

Solution. It is convenient to introduce dimensionless variables as follows. For the
energy $E'$ (3.100) take

$$\varepsilon := E'/Mgl . \tag{A.1}$$

Instead of the projections $L_3$ and $\bar{L}_3$ introduce

$$\lambda := \frac{L_3}{\sqrt{I_1' Mgl}} , \qquad \bar{\lambda} := \frac{\bar{L}_3}{\sqrt{I_1' Mgl}} . \tag{A.2}$$

The function $f(u)$ on the right-hand side of (3.104) is replaced with the
dimensionless function

$$\varphi(u) := \frac{I_1'}{Mgl}\, f(u) = 2 (1 - u^2)(\varepsilon + 1 - u) - (\lambda - \bar{\lambda} u)^2 . \tag{A.3}$$

As one may easily verify, the ratio $I_1'/Mgl$ has dimension (time)$^2$. Thus,
$\omega := \sqrt{Mgl/I_1'}$ is a frequency. Finally, using the dimensionless time variable
$\tau := \omega t$, (3.104) becomes

$$\left( \frac{du}{d\tau} \right)^2 = \varphi(u) . \tag{A.4}$$

Vertical rotation is stable if $L_3^2 > 4 Mgl I_1'$, i.e. if $\lambda > 2$. The top is vertical if $\lambda = \bar{\lambda}$.
With $u \to 1$, the critical energy, with regard to stability, is then $\varepsilon_{\rm crit.}(\lambda = \bar{\lambda}) = 0$.
For the suspended top we have $\lambda = -\bar{\lambda}$, $u \to -1$, and the critical energy is

$$\varepsilon_{\rm crit.}(\lambda = -\bar{\lambda}) = -2 .$$

The equations of motion now read

$$\left( \frac{du}{d\tau} \right)^2 = \varphi(u) \quad \text{or} \quad \left( \frac{d\theta}{d\tau} \right)^2 = \frac{\varphi(u)}{1 - u^2} , \tag{A.5}$$

$$\frac{d\Phi}{d\tau} = \bar{\lambda}\, \frac{u_0 - u}{1 - u^2} \quad \text{with} \quad u_0 = \frac{L_3}{\bar{L}_3} = \frac{\lambda}{\bar{\lambda}} . \tag{A.6}$$

Curve A of Fig. 3.25 corresponds to the case of a suspended top, i.e. $\lambda = -\bar{\lambda}$ and
$u_0 = -1$. We have chosen $\varepsilon = 0$, $\lambda = 3.0$. Curve C corresponds to the vertical top,
and we have chosen $\varepsilon = 2$, $\lambda = \bar{\lambda} = 5$. Curve B, finally, describes an intermediate
situation. Here we have taken $\varepsilon = 2$, $\lambda = 4$, $\bar{\lambda} = 6$.

The differential equations (A.5) and (A.6) can be integrated numerically, e.g.
by means of the Runge-Kutta procedure described in Practical Example 2.2. For
this purpose let

$$y = \begin{pmatrix} \theta \\ \Phi \end{pmatrix} , \qquad v = \begin{pmatrix} d\theta/d\tau \\ d\Phi/d\tau \end{pmatrix} ,$$

and read (A.11) and (A.12) of the Appendix to Chapter 2 as equations with two
components each. This allows us to represent the motion of the axis of symmetry in
terms of angular coordinates $(\theta, \Phi)$ in the strip between the two parallels defined
by $u_1$ and $u_2$. It requires a little more effort to transcribe the results onto the unit
sphere and to represent them, by a suitable projection, as in Fig. 3.26.
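A possible numerical treatment is sketched below. It is not the procedure of Practical Example 2.2 itself: to avoid choosing the sign of the square root in $(du/d\tau)^2 = \varphi(u)$ at each turning point, the sketch differentiates that relation once, which gives $d^2u/d\tau^2 = \tfrac{1}{2}\varphi'(u)$, and integrates this together with $d\Phi/d\tau = (\lambda - \bar{\lambda}u)/(1 - u^2)$ by a classical fourth-order Runge-Kutta step. The step size, the number of steps, the start value $u = 2/3$ and all function names are assumptions; the parameters are those quoted for curve B.

```python
import math

def phi(u, eps, lam, lam_bar):
    """Dimensionless function phi(u) of (A.3)."""
    return 2.0 * (1.0 - u * u) * (eps + 1.0 - u) - (lam - lam_bar * u) ** 2

def dphi_du(u, eps, lam, lam_bar):
    """Derivative of phi with respect to u."""
    return (-4.0 * u * (eps + 1.0 - u) - 2.0 * (1.0 - u * u)
            + 2.0 * lam_bar * (lam - lam_bar * u))

def rhs(state, eps, lam, lam_bar):
    """state = (u, du/dtau, Phi); returns its derivative with respect to tau."""
    u, w, _ = state
    return (w,
            0.5 * dphi_du(u, eps, lam, lam_bar),
            (lam - lam_bar * u) / (1.0 - u * u))

def rk4_step(state, h, *params):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(state, *params)
    k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), *params)
    k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), *params)
    k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)), *params)
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_top(u0, eps, lam, lam_bar, h=1e-3, steps=10000):
    """Start at u0 with du/dtau = +sqrt(phi(u0)) and Phi = 0."""
    state = (u0, math.sqrt(phi(u0, eps, lam, lam_bar)), 0.0)
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(state, h, eps, lam, lam_bar)
        trajectory.append(state)
    return trajectory

# Parameters of curve B; start value and step size are assumptions.
EPS, LAM, LAM_BAR = 2.0, 4.0, 6.0
trajectory = integrate_top(2.0 / 3.0, EPS, LAM, LAM_BAR)
us = [s[0] for s in trajectory]
thetas = [math.acos(u) for u in us]          # polar angle of the symmetry axis
# (du/dtau)^2 - phi(u) should stay zero along the orbit:
max_violation = max(abs(s[1] ** 2 - phi(s[0], EPS, LAM, LAM_BAR)) for s in trajectory)
```

The quantity `max_violation` monitors how well the first integral $(du/d\tau)^2 = \varphi(u)$ is preserved; the extreme values of `us` approximate the two parallels $u_1$ and $u_2$ between which the axis nutates.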

2. The Tippe Top. Under the assumption that the coefficient of gliding friction
is proportional to $g_n$, the coefficient that appears in the normal force, numerically
integrate the equations of motion (3.120a-c) with the three possible choices for
the moments of inertia.

Solution. The assumption is that $g_{\rm fr} = \mu g_n$. Let $v = \|\mathbf{v}\|$ be the modulus of the
velocity. In order to avoid the discontinuity at $v = 0$ on the right-hand side of
(3.120c) one can replace $\hat{\mathbf{v}}$ by

$$\hat{\mathbf{v}} \longrightarrow \tanh(M\|\mathbf{v}\|)\, \frac{\mathbf{v}}{\|\mathbf{v}\|} ,$$

where $M$ is a large positive number. Indeed, the factor $\tanh(M\|\mathbf{v}\|)$ vanishes at
zero and tends quickly, yet in a continuous fashion, to 1. It is useful to introduce
appropriate units of length, mass, and time such that $R = 1$, $m = 1$ and $g = 1$.
Furthermore, the coefficient of friction $\mu$ should be chosen sufficiently large, say
$\mu = 0.75$, so that the numerical solutions quickly reach the asymptotic state(s).
Compare your results with the examples given by Ebenfeld and Scheck (1995),
footnote 3.
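A compact way to set up such an integration is sketched here. It is only an illustration under stated assumptions: units with $R = m = g = 1$, the vector $\mathbf{a} = \hat{e}_{3_0} - \alpha\hat{\eta}$ (so that $-\mathbf{a} = \alpha\hat{\eta} - \hat{e}_{3_0}$ as in the torque), the constant of the motion taken as $\lambda = \mathbf{L}\cdot\mathbf{a}$, a fourth-order Runge-Kutta step, and illustrative values for $I_1$, $I_3$, $\alpha$, $M$, the step size and the initial state. All function names are hypothetical. The sketch integrates (3.120a-c) with the normal force from (3.121) and the tanh-regularized friction, and monitors $\lambda$ and $\|\hat{\eta}\|$ as consistency checks.

```python
import math

E3 = [0.0, 0.0, 1.0]   # vertical unit vector

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def smooth_direction(v, M):
    """tanh(M |v|) v/|v|, the continuous replacement for the unit vector v/|v|."""
    speed = math.sqrt(dot(v, v))
    if speed < 1e-12:
        return [M * c for c in v]            # limit of tanh(M x)/x for x -> 0
    factor = math.tanh(M * speed) / speed
    return [factor * c for c in v]

def derivative(state, I1, I3, alpha, mu, M):
    """state = eta(3), L(3), s_12(2), sdot_12(2); returns d(state)/dt."""
    eta, L, sdot12 = state[0:3], state[3:6], state[8:10]
    L3, Lbar3, eta3 = L[2], dot(L, eta), eta[2]
    k = (I3 - I1) / I3
    omega = [(L[i] - k * Lbar3 * eta[i]) / I1 for i in range(3)]      # (3.117)
    a = [E3[i] - alpha * eta[i] for i in range(3)]
    sdot3 = -(alpha / I1) * dot(E3, cross(L, eta))                    # (3.119)
    wxa = cross(omega, a)
    v = [sdot12[0] - wxa[0], sdot12[1] - wxa[1], sdot3 - wxa[2]]      # velocity of A
    vhat = smooth_direction(v, M)
    w = [(eta3 - alpha) * E3[i] - (1.0 - alpha * eta3) * eta[i] for i in range(3)]
    numerator = I1 * (1.0 + alpha * (eta3 * dot(L, L) - L3 * Lbar3) / I1 ** 2)
    denominator = I1 + alpha ** 2 * (1.0 - eta3 ** 2) + alpha * mu * dot(w, vhat)
    gn = numerator / denominator                                      # (3.121), m = g = 1
    force_at_A = [gn * (E3[i] - mu * vhat[i]) for i in range(3)]
    d_eta = cross(omega, eta)                                         # (3.120a)
    d_L = cross([-c for c in a], force_at_A)                          # (3.120b)
    d_sdot12 = [-mu * gn * vhat[0], -mu * gn * vhat[1]]               # (3.120c)
    return d_eta + d_L + list(sdot12) + d_sdot12

def rk4_step(state, h, *params):
    def shifted(k, c):
        return [s + c * ki for s, ki in zip(state, k)]
    k1 = derivative(state, *params)
    k2 = derivative(shifted(k1, 0.5 * h), *params)
    k3 = derivative(shifted(k2, 0.5 * h), *params)
    k4 = derivative(shifted(k3, h), *params)
    return [s + h * (p + 2.0 * q + 2.0 * r + t) / 6.0
            for s, p, q, r, t in zip(state, k1, k2, k3, k4)]

def jellett(state, alpha):
    """lambda = L . a with a = e3 - alpha * eta (assumed sign convention)."""
    eta, L = state[0:3], state[3:6]
    return L[2] - alpha * dot(L, eta)

# Illustrative (assumed) parameters with I3 (1 - alpha) < I1 < I3 (1 + alpha).
I1, I3, alpha, mu, M = 0.42, 0.40, 0.1, 0.75, 50.0
theta0 = 0.1                                   # initial tilt of the symmetry axis
state = [math.sin(theta0), 0.0, math.cos(theta0),   # eta
         0.0, 0.0, 2.0,                              # L, initially vertical
         0.0, 0.0, 0.0, 0.0]                         # s_12 and sdot_12
lambda_start = jellett(state, alpha)
for _ in range(2000):                           # integrate up to t = 1
    state = rk4_step(state, 5e-4, I1, I3, alpha, mu, M)
lambda_drift = abs(jellett(state, alpha) - lambda_start)
norm_drift = abs(dot(state[0:3], state[0:3]) - 1.0)
```

To study the asymptotic states one would extend the loop to much longer times and record $\eta_3(t)$ = `state[2]`; whether and how fast it approaches $-1$ depends on the chosen moments of inertia and initial conditions, which is the comparison the exercise asks for.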




4. Relativistic Mechanics 



Mechanics, as we studied it in the first three chapters, is based on two fundamental 
principles. On the one hand one makes use of simple functions such as the La- 
grangian function and of functionals such as the action integral whose properties 
are clear and easy to grasp. In general, Lagrangian and Hamiltonian functions do 
not represent quantities that are directly measurable. However, they allow us to 
derive the equations of motion in a general and simple way. Also, they exhibit 
the specific symmetries of a given dynamical system more clearly than the equa- 
tions of motion themselves, whose form and transformation properties are usually 
complicated. 

On the other hand, one assumes a very special structure for the space-time 
manifold that supports mechanical motion. In the cases discussed up until now 
the equations of motion were assumed to be form-invariant with regard to general 
Galilei transformations (Sect. 1.13; see also the discussion in Sect. 1.14). This im- 
plied, in particular, that Lagrangian functions, kinetic and potential energies, had 
to be invariant under these transformations. 

While the first “building principle” is valid far beyond nonrelativistic point 
mechanics (provided one is prepared to generalize it to some extent, if necessary), 
the validity of the principle of Galilei invariance of kinematics and dynamics is 
far more restricted. True, celestial mechanics as well as the mechanics that we 
encounter in daily life when playing billiards, riding a bicycle, working with a 
block-and-tackle, etc., is described by the Galilei-invariant theory of gravitation to 
a very high accuracy. However, this is not true, in general, for microscopic objects 
such as elementary particles, and it is never true for nonmechanical theories such 
as Maxwell’s theory of electromagnetic phenomena. Without actually having to 
give up the general, formal framework altogether, one must replace the principle of 
Galilei invariance by the more general principle of Lorentz or Poincaré invariance.
While in a hypothetical Galilei-invariant world, particles can have arbitrarily large
velocities, Poincaré transformations contain an upper limit for physical velocities:
the (universal) speed of light. Galilei-invariant dynamics then appears as a limiting 
case, applicable whenever velocities are small compared to the speed of light. 

In this chapter we learn why the velocity of light plays such a special role, in
what way the Lorentz transformations follow from the universality of the speed
of light, and how to derive the main properties of these transformations. Today,
basing our conclusions on a great amount of experience and increasingly precise
experimental information, we believe that any physical theory is (at least locally)
Lorentz invariant¹. Therefore, in studying the special theory of relativity, within the
example of mechanics, we meet another pillar on which physics rests and whose
importance stretches far beyond classical mechanics.

4.1 Failures of Nonrelativistic Mechanics 

We wish to demonstrate, by means of three examples, why Galilei invariant me- 
chanics cannot be universally valid. 

(i) Universality of the speed of light. Experiment tells us that the speed of light, 
with respect to inertial systems of reference, is a universal constant. Its value is 

$$c = 2.997\,924\,58 \times 10^{8}\ \mathrm{m\,s^{-1}} . \tag{4.1}$$

Our arguments of Sect. 1.14 show clearly that in Galilei-invariant mechanics a
universal velocity and, in particular, an upper limit for velocities cannot exist. This
is so because any process with characteristic velocity $\mathbf{v}$, with respect to an inertial
system of reference $K_1$, can be observed from another inertial system $K_2$, moving
with constant velocity $\mathbf{w}$ relative to $K_1$. With respect to $K_2$, the process then has
the velocity

$$\mathbf{v}' = \mathbf{v} + \mathbf{w} , \tag{4.2}$$

in other words, velocities add linearly and therefore can be made arbitrarily large.

(ii) Particles without mass carry energy and momentum. In nonrelativistic
mechanics the kinetic energy and momentum of a free particle are related by

$$E = T = \frac{\mathbf{p}^2}{2m} . \tag{4.3}$$

In nature there are particles whose mass vanishes. For instance, the photon (or light
quantum), which is the carrier of electromagnetic interactions, is a particle whose
mass vanishes. Nevertheless, a photon carries energy and momentum (as proved by
the photoelectric effect, for example), even though relation (4.3) is meaningless in
this case: neither is the energy $E$ infinite when $|\mathbf{p}|$ is finite nor does the momentum
vanish when $E$ has a finite value. In the simplest situation a photon is characterized
by a circular frequency $\omega$ and a wavelength $\lambda$ that are related by $\omega\lambda = 2\pi c$. If
the energy $E_\gamma$ of the photon is proportional to $\omega$ and if its momentum is inversely
proportional to $\lambda$, then (4.3) is replaced with a relation of the form

$$T_\gamma \equiv E_\gamma = \alpha |\mathbf{p}| c , \tag{4.4}$$

where the index $\gamma$ is meant to refer to a photon and where $\alpha$ is a dimensionless
number (it will be found to be equal to 1 below). Furthermore, the photon has
only kinetic energy, hence $E_\gamma$ (total energy) $= T_\gamma$ (kinetic energy).

1 Space inversion P and time reversal T are excepted because there are interactions
in nature that are Lorentz invariant but not invariant under P and under T.

Further, there are even processes where a massive particle decays into several
massless particles so that its mass is completely converted into kinetic energy. For
example, an electrically neutral $\pi$ meson decays spontaneously into two photons:

$$\pi^0\ (\text{massive}) \longrightarrow \gamma + \gamma\ (\text{massless}) ,$$

where $m(\pi^0) = 2.4 \times 10^{-28}$ kg. If the $\pi^0$ is at rest before the decay, the momenta
of the two photons are found to add up to zero,

$$\mathbf{p}_\gamma^{(1)} + \mathbf{p}_\gamma^{(2)} = 0 ,$$

while the sum of their energies is equal to $m(\pi^0)$ times the square of the speed
of light,

$$T_\gamma^{(1)} + T_\gamma^{(2)} = c\,\bigl( |\mathbf{p}_\gamma^{(1)}| + |\mathbf{p}_\gamma^{(2)}| \bigr) = m(\pi^0)\, c^2 .$$

Apparently, a massive particle has a finite nonvanishing energy, even when it
is at rest:

$$E(\mathbf{p} = 0) = mc^2 . \tag{4.5}$$

This energy is said to be its rest energy. Its total energy, at finite momentum, is
then

$$E(\mathbf{p}) = mc^2 + T(\mathbf{p}) , \tag{4.6}$$

where $T(\mathbf{p})$, at least for small velocities $|\mathbf{p}|/m \ll c$, is given by (4.3), while for
massless particles ($m = 0$) it is given by (4.4) with $\alpha = 1$.

Of course, one is curious to know how these two statements can be reconciled.
As we shall soon learn, the answer is provided by the relativistic energy-momentum
relation

$$E(\mathbf{p}) = \sqrt{(mc^2)^2 + \mathbf{p}^2 c^2} , \tag{4.7}$$

which is generally valid for a free particle of any mass and which contains both
(4.3) and (4.4), with $\alpha = 1$. If this is so the kinetic energy is given by

$$T(\mathbf{p}) = E(\mathbf{p}) - mc^2 = \sqrt{(mc^2)^2 + \mathbf{p}^2 c^2} - mc^2 . \tag{4.8}$$
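The two limits contained in (4.7) and (4.8) are easy to explore numerically. The following sketch is illustrative: the function names and the test momenta are assumptions, the pion mass is the value quoted above, and the kinetic energy is evaluated in the algebraically equivalent form $T = \mathbf{p}^2c^2/(E + mc^2)$, a rewriting chosen here only to avoid loss of precision when $|\mathbf{p}| \ll mc$.

```python
import math

C = 299_792_458.0   # speed of light in m/s, cf. (4.1)

def total_energy(p, m):
    """E(p) = sqrt((m c^2)^2 + p^2 c^2), cf. (4.7); p is |p| in kg m/s, m in kg."""
    return math.hypot(m * C ** 2, p * C)

def kinetic_energy(p, m):
    """T = E - m c^2, cf. (4.8), computed as p^2 c^2 / (E + m c^2)."""
    if p == 0.0:
        return 0.0
    return (p * C) ** 2 / (total_energy(p, m) + m * C ** 2)

# Massless limit: T = E = |p| c, i.e. (4.4) with alpha = 1.
p_photon = 1.0e-27                      # illustrative momentum in kg m/s
massless_energy = kinetic_energy(p_photon, 0.0)

# Slow massive particle: T is very close to p^2 / (2 m), i.e. (4.3).
m_slow, p_slow = 1.0, 1.0               # 1 kg moving at about 1 m/s (illustrative)
slow_kinetic = kinetic_energy(p_slow, m_slow)
newtonian_kinetic = p_slow ** 2 / (2.0 * m_slow)

# Decay of a pi0 at rest: two photons with |p| = m c / 2 each.
m_pi0 = 2.4e-28                         # kg, value quoted in the text
photon_energy = total_energy(m_pi0 * C / 2.0, 0.0)
rest_energy = total_energy(0.0, m_pi0)
```

With these numbers each photon carries half of the rest energy of the $\pi^0$, in line with the energy balance of the decay discussed above.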



Indeed, for $m = 0$ this gives $T = E = |\mathbf{p}| c$, while for $m \neq 0$ and for small
momenta $|\mathbf{p}|/m \ll c$

$$T(\mathbf{p}) \simeq mc^2 \left[ 1 + \frac{1}{2}\, \frac{\mathbf{p}^2 c^2}{(mc^2)^2} - 1 \right] = \frac{\mathbf{p}^2}{2m} , \tag{4.9}$$

which is independent of the speed of light $c$!

(iii) Radioactive decay of moving particles. There are elementary particles that are
unstable but decay relatively “slowly” (quantum mechanics teaches us that this is
realized when their lifetime is very much larger than Planck’s constant divided by
the rest energy, $\tau \gg h/2\pi mc^2$). Their decay can then be studied under various
experimental conditions. As an example take the muon $\mu$, which is a kind of heavy,
and unstable, electron. Its mass is about 207 times larger than the mass of the
electron²,

$$m(\mu)\, c^2 = 206.77\, m(e)\, c^2 . \tag{4.10}$$

The muon decays spontaneously into an electron and two neutrinos (nearly
massless particles that have only weak interactions),

$$\mu \longrightarrow e + \nu_1 + \nu_2 . \tag{4.11}$$

If one stops a large number of muons in the laboratory and measures their lifetime,
one gets³

$$\tau^{(0)}(\mu) = (2.19703 \pm 0.00004) \times 10^{-6}\ \mathrm{s} . \tag{4.12}$$

If one performs the same measurement on a beam of muons that move at constant
velocity $v$ in the laboratory, one gets

$$\tau^{(v)}(\mu) = \gamma\, \tau^{(0)}(\mu) , \quad \text{where} \quad \gamma = E/mc^2 = (1 - v^2/c^2)^{-1/2} . \tag{4.13}$$

For example, a measurement at $\gamma = 29.33$ gave the value

$$\tau^{(v)}(\mu) = 64.39 \times 10^{-6}\ \mathrm{s} \simeq 29.3\, \tau^{(0)}(\mu) .$$

This is an astounding effect: the instability of a muon is an internal property of the
muon and has nothing to do with its state of motion. Its mean lifetime is something
like a clock built into the muon. Experiment tells us that this clock ticks more
slowly when the clock and the observer who reads it are in relative motion than
when they are at rest. Relation (4.13) even tells us that the lifetime, as measured
by an observer at rest, tends to infinity when the velocity $|\mathbf{v}|$ approaches the speed
of light.

If, instead, we had applied Galilei-invariant kinematics to this problem, the
lifetime in motion would be the same as at rest. Again, there is no contradiction
with the relativistic relationship (4.13) because $\gamma \simeq 1 + v^2/2c^2$. For $|\mathbf{v}| \ll c$ the
nonrelativistic situation is realized.
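The numbers in this example can be reproduced with a few lines. The sketch uses only (4.12) and (4.13) together with $\gamma = 29.33$ from the text; the function names and the small test velocity are assumptions made for illustration.

```python
import math

TAU_REST = 2.19703e-6    # muon lifetime at rest in s, cf. (4.12)

def gamma_from_beta(beta):
    """gamma = (1 - v^2/c^2)^(-1/2) with beta = v/c, cf. (4.13)."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def beta_from_gamma(gamma):
    """Inverse relation: v/c = sqrt(1 - 1/gamma^2)."""
    return math.sqrt(1.0 - 1.0 / gamma ** 2)

def lifetime_in_flight(gamma, tau_rest=TAU_REST):
    """Dilated lifetime tau_v = gamma * tau_0, cf. (4.13)."""
    return gamma * tau_rest

gamma_beam = 29.33
tau_beam = lifetime_in_flight(gamma_beam)      # about 64.44e-6 s
beta_beam = beta_from_gamma(gamma_beam)        # about 0.9994

# Small-velocity check of gamma ~ 1 + v^2 / (2 c^2):
beta_small = 1.0e-3
gamma_exact = gamma_from_beta(beta_small)
gamma_approx = 1.0 + 0.5 * beta_small ** 2
```

The predicted value $\gamma\tau^{(0)} \approx 64.44 \times 10^{-6}$ s agrees with the quoted measurement of $64.39 \times 10^{-6}$ s to better than one part in a thousand, and the muons in that beam move at about 99.94 % of the speed of light.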



2 These results as well as references to the original literature are to be found in the Review of
Particle Properties, Physics Letters B592 (2004) 1 and (on the web) http://pdg.lbl.gov.

3 J. Bailey et al., Nucl. Phys. B 150 (1979) 1.







4.2 Constancy of the Speed of Light 

The starting point and essential basis of the special theory of relativity is the fol- 
lowing experimental observation that we formulate in terms of a postulate: 



Postulate I. In vacuum, light propagates, with respect to any inertial system 
and in all directions, with the universal velocity c (4.1). This velocity is a 
constant of nature. 



As the value of the speed of light $c$ is fixed at $299\,792\,458\ \mathrm{m\,s^{-1}}$ and as there are
extremely precise methods for measuring frequencies, and hence time, the meter is
defined by the distance that a light ray traverses in the fraction $1/299\,792\,458$ of a
second. (This replaces the standard meter, i.e. the measuring rod that is deposited
in Paris.)

The postulate is in clear contradiction to the Galilei invariance studied in
Sect. 1.13. In the nonrelativistic limit, two arbitrary inertial systems are related
by the transformation law (see (1.32))

$$\mathbf{x}' = \mathbf{R}\mathbf{x} + \mathbf{w} t + \mathbf{a} , \qquad t' = \lambda t + s \quad (\lambda = \pm 1) , \tag{4.14}$$

according to which the velocities of a given process, measured with respect to
two different inertial systems, are related by $\mathbf{v}' = \mathbf{v} + \mathbf{w}$. If Postulate I is correct,
(4.14) must be replaced with another relation, which must be such that it leaves
the velocity of light invariant from one inertial frame to another and that (4.14)
holds whenever $|\mathbf{w}| \ll c$ holds.

In order to grasp the consequences of this postulate more precisely, imagine the
following experiment of principle. We are given two inertial systems $K$ and $K'$. Let
a light source at position $\mathbf{x}_A$ emit a signal at time $t_A$, position and time coordinates
referring to $K$. In vacuum, this signal propagates in all directions with constant
velocity $c$ and hence lies on a sphere with its center at $\mathbf{x}_A$. If we measure this
signal at a later time $t_B > t_A$, at a point $\mathbf{x}_B$ in space, then obviously
$|\mathbf{x}_B - \mathbf{x}_A| = c\,(t_B - t_A)$, or, if we take the squares,

$$(\mathbf{x}_B - \mathbf{x}_A)^2 - c^2 (t_B - t_A)^2 = 0 . \tag{4.15}$$

Points with coordinates $(\mathbf{x}, t)$ for which one indicates the three spatial
coordinates as well as the time at which something happens at $\mathbf{x}$ (emission or detection
of a signal, for instance) are called world points or events. Accordingly, the
propagation of a signal described by a parametrized curve $(\mathbf{x}(t), t)$ is called a world
line.

Suppose the world points $(\mathbf{x}_A, t_A)$, $(\mathbf{x}_B, t_B)$ have the coordinates $(\mathbf{x}'_A, t'_A)$ and
$(\mathbf{x}'_B, t'_B)$, respectively, with regard to the system $K'$. Postulate I implies that these
points must be connected by the same relation (4.15), i.e.

$$(\mathbf{x}'_B - \mathbf{x}'_A)^2 - c^2 (t'_B - t'_A)^2 = 0$$

with the same, universal constant $c$. In other words the special form

$$\mathbf{z}^2 - (z^0)^2 = 0 , \tag{4.16}$$

relating the spatial distance $|\mathbf{z}| = |\mathbf{x}_B - \mathbf{x}_A|$ of two world points A and B to the
difference of their time coordinates $z^0 = c\,(t_B - t_A)$, must be invariant under all
transformations that map inertial systems onto inertial systems. In fact, we confirm
immediately that there are indeed subgroups of the Galilei group that leave this
form invariant. These are

(i) translations $t' = t + s$ and $\mathbf{x}' = \mathbf{x} + \mathbf{a}$, and

(ii) rotations $t' = t$ and $\mathbf{x}' = \mathbf{R}\mathbf{x}$.

This is not true, however, for special Galilei transformations, i.e. in the case where
the two inertial systems move relative to each other. In this case (4.14) reads $t' = t$,
$\mathbf{x}' = \mathbf{x} + \mathbf{w} t$, so that $(\mathbf{x}'_A - \mathbf{x}'_B)^2 = (\mathbf{x}_A - \mathbf{x}_B + \mathbf{w}\,(t_A - t_B))^2$, which is evidently
not equal to $(\mathbf{x}_A - \mathbf{x}_B)^2$. What is the most general transformation

$$(t, \mathbf{x}) \underset{\Lambda}{\longmapsto} (t', \mathbf{x}') \tag{4.17}$$

that replaces (4.14) and is such that the invariance of the form (4.16) is guaranteed?
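The failure of the special Galilei transformation can be made concrete with one lightlike pair of events. The sketch below works in units with $c = 1$; the separation and the velocity $\mathbf{w}$ are illustrative assumptions, and the function names are hypothetical.

```python
def light_cone_form(dx, dt, c=1.0):
    """Value of z^2 - (z^0)^2 for spatial difference dx and time difference dt, cf. (4.16)."""
    return sum(d * d for d in dx) - (c * dt) ** 2

def galilei_boost_difference(dx, dt, w):
    """Coordinate differences after the special Galilei transformation t' = t, x' = x + w t."""
    return [d + wi * dt for d, wi in zip(dx, w)], dt

def translate_difference(dx, dt):
    """Translations x' = x + a, t' = t + s drop out of coordinate differences."""
    return list(dx), dt

dx, dt = [1.0, 0.0, 0.0], 1.0          # lightlike: |dx| = c dt with c = 1
w = [0.1, 0.0, 0.0]                    # relative velocity of the two frames

form_before = light_cone_form(dx, dt)                                  # vanishes
form_translated = light_cone_form(*translate_difference(dx, dt))       # still vanishes
form_boosted = light_cone_form(*galilei_boost_difference(dx, dt, w))   # (1.1)^2 - 1
```

In the boosted frame the form takes the value $2\,\mathbf{z}\cdot\mathbf{w}\,\Delta t + \mathbf{w}^2 \Delta t^2$, here $0.21$, so the signal would no longer travel with speed $c$.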



4.3 The Lorentz Transformations 



In order to unify the notation let us introduce the following definitions:

$$x^0 := ct , \qquad (x^1, x^2, x^3) := \mathbf{x} .$$

It is customary to denote indices referring to space components only by Latin
letters $i, j, k, \ldots$. If one refers to space and time components, without distinction,
one uses Greek letters $\mu, \nu, \varrho, \ldots$ instead. Thus

$x^\mu$: $\mu = 0, 1, 2, 3$ denotes the world point $(x^0 = ct, x^1, x^2, x^3)$, and

$x^i$: $i = 1, 2, 3$ denotes its spatial components.

One also writes $x$ for a world point and $\mathbf{x}$ for its spatial part so that
$x^\mu = (x^0, \mathbf{x})$.



Using this notation (4.15) reads

$$(x_B^0 - x_A^0)^2 - (\mathbf{x}_B - \mathbf{x}_A)^2 = 0 .$$

This form bears some analogy to the squared norm of a vector in $n$-dimensional
Euclidean space $\mathbb{R}^n$, which is written in various ways:

$$\mathbf{x}_E^2 = \sum_{i=1}^{n} (x^i)^2 = \sum_{i=1}^{n} \sum_{k=1}^{n} x^i \delta_{ik} x^k = (\mathbf{x}, \mathbf{x})_E . \tag{4.18}$$

(The index E stands for Euclidean.) The Kronecker symbol $\delta_{ik}$ is a metric tensor
here. As such it is invariant under rotations in $\mathbb{R}^n$, i.e.

$$\mathbf{R}^T \delta\, \mathbf{R} = \delta .$$

A well-known example is provided by $\mathbb{R}^3$, the three-dimensional Euclidean space
with the metric

$$\delta_{ik} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} .$$

In four space-time dimensions, following the analogy with the example above,
we introduce the following metric tensor:

$$g_{\mu\nu} = g^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} . \tag{4.19}$$

This enables us to write the invariant form (4.15) as follows:

$$\sum_{\mu=0}^{3} \sum_{\nu=0}^{3} (x_B^\mu - x_A^\mu)\, g_{\mu\nu}\, (x_B^\nu - x_A^\nu) = 0 . \tag{4.20}$$



Before we move on, we wish to stress that the position of (Greek) indices
matters: one must distinguish upper (or contravariant) indices from lower (or
covariant) indices. For instance, we have

$$x^\mu = (x^0, \mathbf{x}) \tag{4.21a}$$

(by definition), but

$$x_\lambda := \sum_{\mu=0}^{3} g_{\lambda\mu} x^\mu = (x^0, -\mathbf{x}) . \tag{4.21b}$$

For example, the generalized scalar product that appears in (4.20) can be written
in several ways, viz.

$$(z, z) = (z^0)^2 - \mathbf{z}^2 = \sum_{\mu\nu} z^\mu g_{\mu\nu} z^\nu = \sum_{\mu} z^\mu z_\mu = \sum_{\nu} z_\nu z^\nu . \tag{4.22}$$

Note that the indices to be summed always appear in pairs, one being an upper
index and one a lower index. As one can sum only a covariant and a contravariant
index, it is useful to introduce Einstein's summation convention, which says that
expressions such as $A_\alpha B^\alpha$ should be understood to be

$$\sum_{\alpha=0}^{3} A_\alpha B^\alpha .$$

Remarks: The bra and ket notation that we used in Chap. 3 is very useful in the
present context, too. A point $x$ of $M^4$, or likewise a tangent vector $a = (a^0, \mathbf{a})^T$,
is represented by a four-component column,

$$|x\rangle = (x^0, \mathbf{x})^T , \qquad |a\rangle = (a^0, \mathbf{a})^T .$$

Objects which are dual to them are written as row vectors but contain the minus
sign that follows from the metric tensor $g = \mathrm{diag}\,(1, -1, -1, -1)$,

$$\langle y| = (y^0, -\mathbf{y}) , \qquad \langle b| = (b^0, -\mathbf{b}) .$$

Taking the scalar product in the sense of multiplying a $1 \times 4$ matrix and a $4 \times 1$ matrix
yields the correct answers

$$\langle y|x\rangle = y^0 x^0 - \mathbf{y} \cdot \mathbf{x} , \qquad \langle b|a\rangle = b^0 a^0 - \mathbf{b} \cdot \mathbf{a} ,$$

for the Lorentz invariants which can be formed out of them. The “bra-ket” notation
emphasizes that $\langle b|$ is the dual object that acts on $|a\rangle$, very much in the spirit of
linear algebra.

The metric tensor defined in (4.19) has the following properties:

(i) It is invariant under the transformations (4.17).

(ii) It fulfills the relations $g_{\alpha\beta}\, g^{\beta\gamma} = \delta_\alpha^{\ \gamma}$, where $\delta_\alpha^{\ \gamma}$ is the Kronecker symbol,
and $g_{\alpha\beta} = g_{\alpha\mu}\, g^{\mu\nu} g_{\nu\beta} = g^{\alpha\beta}$.

(iii) Its determinant is $\det g = -1$.

(iv) Its inverse and its transpose are $g^{-1} = g = g^T$.

The problem posed in (4.17) consists in constructing the most general affine
transformation

$$x^\mu \underset{(\Lambda, a)}{\longmapsto} x'^\mu : \quad x'^\mu = \Lambda^\mu{}_\sigma\, x^\sigma + a^\mu \tag{4.23}$$

that guarantees the invariance of the form (4.15). Any such transformation maps
inertial frames onto inertial frames because any uniform motion along a straight
line is transformed into a state of motion of the same type.

Inserting the general form (4.23) into the form (4.15) or (4.20), and in either
system of reference $K$ or $K'$:

$$(x_B^\mu - x_A^\mu)\, g_{\mu\nu}\, (x_B^\nu - x_A^\nu) = 0 = (x_B'^{\alpha} - x_A'^{\alpha})\, g_{\alpha\beta}\, (x_B'^{\beta} - x_A'^{\beta}) ,$$

we note that the translational part cancels out. As to the homogeneous part $\Lambda$,
which is a $4 \times 4$ matrix, we obtain the condition

$$\Lambda^\sigma{}_\mu\, g_{\sigma\tau}\, \Lambda^\tau{}_\nu = \alpha\, g_{\mu\nu} , \tag{4.24}$$

where $\alpha$ is a real positive number that remains undetermined for the moment. In
fact, if we decide to write $x$ as a shorthand notation for the contravariant vector
$x^\mu$ and $\Lambda$ instead of $\Lambda^\mu{}_\nu$,

$$x \equiv \{x^\mu\} , \qquad \Lambda \equiv \{\Lambda^\mu{}_\nu\} ,$$

then (4.23) and (4.24) can be written in the compact form

$$x' = \Lambda x + a , \tag{4.23'}$$

$$\Lambda^T g \Lambda = \alpha g . \tag{4.24'}$$

Here $x$ is a column vector and $\Lambda$ is a $4 \times 4$ matrix, and we use the standard rules
for matrix multiplication. For example, let us determine the inverse $\Lambda^{-1}$ of $\Lambda$,
anticipating that $\alpha = 1$. It is obtained from (4.24') by multiplying this equation
with $g^{-1} = g$ from the left:

$$\Lambda^{-1} = g \Lambda^T g . \tag{4.25}$$

Writing this out in components, we have

$$(\Lambda^{-1})^\alpha{}_\beta = g^{\alpha\mu}\, \Lambda^\nu{}_\mu\, g_{\nu\beta} \quad (=: \Lambda_\beta{}^\alpha) ,$$

a matrix that is sometimes also denoted by $\Lambda_\beta{}^\alpha$.

Because $g$ is not singular, (4.24) implies that $\Lambda$ is not singular. Indeed, from
(4.24),

$$(\det \Lambda)^2 = \alpha^4 .$$
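Conditions (4.24') and (4.25) can be tested on a concrete matrix. The explicit form of $\Lambda$ has not been derived at this point of the text, so the sketch below simply assumes the standard special Lorentz transformation along the 1-axis with velocity parameter $\beta = v/c$; this matrix, the value $\beta = 0.6$ and all function names are assumptions made for illustration.

```python
import math

G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]          # metric tensor (4.19)

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]

def scale(A, factor):
    return [[factor * entry for entry in row] for row in A]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def boost_x(beta):
    """Assumed standard boost along the 1-axis (not yet derived in the text)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def lorentz_form(Lam):
    """Left-hand side of (4.24'): Lambda^T g Lambda."""
    return matmul(transpose(Lam), matmul(G, Lam))

Lam = boost_x(0.6)
satisfies_condition = close(lorentz_form(Lam), G)            # alpha = 1
inverse = matmul(G, matmul(transpose(Lam), G))               # (4.25)
inverse_is_correct = close(matmul(inverse, Lam), IDENTITY)

# A boost combined with a dilatation by 2 fulfills (4.24') with alpha = 4.
dilated = scale(Lam, 2.0)
dilated_gives_alpha_4 = close(lorentz_form(dilated), scale(G, 4.0))
```

The last check illustrates why (4.24) alone leaves $\alpha$ open: multiplying any solution by a common factor $\sqrt{\alpha}$ preserves the vanishing of the form (4.15) but rescales all lengths and times.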

What do we know about the real number $\alpha$ from experience in physics? To answer
this question let us consider two world points (or events) A and O whose difference
$z := x_A - x_O$ does not necessarily fulfill (4.15) or (4.16). Defining their generalized
distance to be $d := (z^0)^2 - (\mathbf{z})^2$, we calculate this distance with respect to the
inertial system $K'$:

$$d' := (z'^0)^2 - (\mathbf{z}')^2 = \alpha \left[ (z^0)^2 - (\mathbf{z})^2 \right] = \alpha d .$$

Taking, for example, rotations in $\mathbb{R}^3$ that certainly fulfill (4.15), we see that this
means that the spatial distance $\sqrt{\mathbf{z}^2}$, as measured from the second system of
reference, appears stretched or compressed by the factor $\sqrt{\alpha}$. More generally, any
spatial distance and any time interval are changed by the factor $\sqrt{\alpha}$, when measured
with respect to $K'$, compared to their value with respect to $K$. This means either that
any dynamics and any equation of motion that depend on spatial distances and on
time differences differ in a measurable way in different frames of reference or that
the laws of nature are invariant under scale transformations $x^\mu \to x'^\mu = \sqrt{\alpha}\, x^\mu$.

The first possibility is in contradiction with the Galilei invariance of mechanics.
This invariance, which is well confirmed by experiment, must hold in the limit of
small velocities. The second possibility contradicts our experience, too: the laws
governing the forces of nature, as far as they are known to us, contain parameters
with dimension and are by no means invariant under scale transformations of
spatial and/or time differences. In fact, this is the main reason we choose the
transformation (4.23) (which is still to be determined) to be an affine transformation.
In conclusion, experience in physics suggests we take the constant $\alpha$ to be equal
to 1,

$$\alpha = 1 . \tag{4.26}$$

Another way of formulating this conclusion is by the following: 



Postulate II. The most general affine transformation $x \mapsto x' = \Lambda x + a$,
$y \mapsto y' = \Lambda y + a$ must leave invariant the generalized distance $z^2 =
(z^0)^2 - \mathbf{z}^2$ (where $z = y - x$), independent of whether $z^2$ is zero or not.



This postulate, which is based on experience, can be obtained in still another way.
Our starting point was the notion of inertial frame of reference, with respect to
which free motion (i.e. motion without external forces) proceeds along a straight
line and with constant velocity. In other words, such a frame has the special
property that dynamics, i.e. the equations of motion, take a particularly simple form. The
class of all inertial frames is the class of reference frames with respect to which
the equations of motion have the same form. By definition and by construction
the transformation $x' = \Lambda x + a$ (4.23) maps inertial frames onto inertial frames.
As the dynamics is characterized by quantities and parameters with dimensions
and as it is certainly not scale invariant, since, furthermore, Postulate I must hold
true, transformations (4.23) must leave the squared norm $z^2 = z^\mu g_{\mu\nu} z^\nu$ invariant.
Postulate II already contains some empirical information: very much as in
nonrelativistic mechanics, lengths and times are relevant, as well as the units that are used
to measure them and that are compared at different world points. The following
postulate is more general and much stronger than this.



Postulate of Special Relativity. The laws of nature are invariant under the
group of transformations $\{\Lambda, a\}$.



This postulate contains Postulate II. It goes far beyond it, however, because it
says that all physical theories, not only mechanics, are invariant under the
transformations $(\Lambda, a)$. Clearly, this is a very strong statement, which reaches far beyond
mechanics. It holds true, indeed, also in the physics of elementary particles (space
reflection and time reversal being excepted), at spatial dimensions of the order of
$10^{-15}$ m and below. In fact, special relativity belongs to those theoretical
foundations of physics whose validity is best established.

According to Postulate II the generalized distance of two world points $x$ and
$y$ is invariant, with respect to transformations (4.23), even when it is nonzero:

$$(y - x)^2 = (y^\alpha - x^\alpha)\, g_{\alpha\beta}\, (y^\beta - x^\beta) = \text{invariant} .$$

Fig. 4.1. Schematic representation of four-dimensional
space-time. $z^0$ is the time axis, $\mathbf{z}$ symbolizes the three
space directions



Note that this quantity can be positive, negative, or zero. This can be visualized 
by plotting the vector $z = y - x$ in such a way that the spatial part $\mathbf{z}$ is represented, 
symbolically, by one axis (the abscissa in Fig. 4.1), while the time component $z^0$ is 
represented by a second axis, perpendicular to the first (the ordinate in the figure). 
The surfaces $z^2 = \text{const}$ are axially symmetric hyperboloids or, if $z^2$ vanishes, 
a double cone, embedded in the space-time continuum. The double cone that is 
tangent, at infinity, to the hyperboloids is called the light cone. Vectors on this 
cone are said to be lightlike; vectors for which $z^2 > 0$, i.e. $(z^0)^2 > \mathbf{z}^2$, are said 
to be timelike; vectors for which $z^2 < 0$, i.e. $(z^0)^2 < \mathbf{z}^2$, are said to be spacelike. 
(These definitions are important because they are independent of the signature of 
the metric tensor g. We have chosen the signature (+, -, -, -) but we could have 
chosen (-, +, +, +) as well.) In Fig. 4.1 the point A is timelike, B is lightlike, 
and C is spacelike. Considering the action of the transformation $(\Lambda, a)$, we see 
that the translation $(\mathbb{1}, a)$ has no effect, since a cancels out in the difference 
$y - x$. The homogeneous part $(\Lambda, 0)$ shifts the points A and C on their respective 
hyperboloids shown in the figure, while it shifts B on the light cone. We give here 
typical examples for the three cases: 

timelike vector: $(z^0, \mathbf{0})$ with $z^0 = \sqrt{z^2}$ , (4.27a) 

spacelike vector: $(0, z^1, 0, 0)$ with $z^1 = \sqrt{-z^2}$ , (4.27b) 

lightlike vector: $(1, 1, 0, 0)$ . (4.27c) 

These three cases can be taken to be the normal forms for timelike, spacelike, 
and lightlike vectors, respectively. Indeed, every timelike vector can be mapped, by 
Lorentz transformations, onto the special form (4.27a). Similarly, every spacelike 
vector can be mapped onto (4.27b), and every lightlike vector can be transformed 
into (4.27c). This will be shown below in Sect. 4.5.2. 

The world points x and y lie in a four-dimensional affine space. Fixing an 
origin (by choosing a coordinate system, for example) makes this the vector space 
$\mathbb{R}^4$. The differences $(y - x)$ of world points are elements of this vector space. 
If we endow this space with the metric structure (4.19) we obtain what is 
called the flat Minkowski space-time manifold $M^4$. This manifold is different, in 
an essential way, from Galilei space-time. In the Galilean space-time manifold the 
statement that two events took place simultaneously is a meaningful one because 
simultaneity is preserved by Galilei transformations. (However, it is not meaningful 
to claim that two events happened at the same point in space but at different 
times.) Absolute simultaneity, i.e. the absolute character of time, as opposed to 
space, no longer holds with regard to Lorentz transformations. We return to this 
question in more detail in Sect. 4.7. 



4.4 Analysis of Lorentz and Poincaré Transformations 

By definition, the transformations $(\Lambda, a)$ leave invariant the generalized distance 
$(x - y)^2 = (x^0 - y^0)^2 - (\mathbf{x} - \mathbf{y})^2$ of two world points. They form a group, 
the inhomogeneous Lorentz group (iL), or Poincaré group. Before turning to their 
detailed analysis we verify that these transformations indeed form a group. 

1. The composition of two Poincaré transformations is again a Poincaré 
transformation: 

$$(\Lambda_2, a_2)(\Lambda_1, a_1) = (\Lambda_2\Lambda_1,\ \Lambda_2 a_1 + a_2) .$$

The homogeneous parts are formed by matrix multiplication; the translational 
part is obtained by applying $\Lambda_2$ to $a_1$ and adding $a_2$. It is easy to verify that 
the product $\Lambda_2\Lambda_1$ obeys (4.24) with $\alpha = 1$. 

2. The composition of more than two transformations is associative: 

$$(\Lambda_3, a_3)\,[(\Lambda_2, a_2)(\Lambda_1, a_1)] = [(\Lambda_3, a_3)(\Lambda_2, a_2)]\,(\Lambda_1, a_1) ,$$

because both ways of bracketing yield the same homogeneous part $\Lambda_3\Lambda_2\Lambda_1$ and 
the same translational part $\Lambda_3\Lambda_2 a_1 + \Lambda_3 a_2 + a_3$. 

3. There exists a unit element, the identical transformation, which is given by 

$$E = (\Lambda = \mathbb{1},\ a = 0) .$$

4. As g is not singular, by (4.24), every transformation $(\Lambda, a)$ has an inverse. It is 
not difficult to verify that the inverse is given by $(\Lambda, a)^{-1} = (\Lambda^{-1}, -\Lambda^{-1}a)$. 
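
The four group properties above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original text: the helper names (`compose`, `inverse`, `act`, `interval`, `boost_x`) are illustrative choices, a boost along the 1-axis serves as the sample matrix $\Lambda$, and units with c = 1 are assumed. It uses $\Lambda^{-1} = g\Lambda^T g$, which follows from $\Lambda^T g \Lambda = g$.

```python
import math

# Metric tensor g with signature (+, -, -, -), as chosen in the text.
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def apply(L, x):
    """Matrix L acting on the four-vector x."""
    return [sum(L[i][k] * x[k] for k in range(4)) for i in range(4)]

def boost_x(beta):
    """Sample Lorentz matrix: boost along the 1-axis with velocity beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0, 0],
            [-gamma * beta, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def compose(t2, t1):
    """(L2, a2)(L1, a1) = (L2 L1, L2 a1 + a2); t1 acts first."""
    (L2, a2), (L1, a1) = t2, t1
    return matmul(L2, L1), [p + q for p, q in zip(apply(L2, a1), a2)]

def inverse(t):
    """(L, a)^-1 = (L^-1, -L^-1 a), with L^-1 = g L^T g."""
    L, a = t
    L_inv = matmul(G, matmul(transpose(L), G))
    return L_inv, [-p for p in apply(L_inv, a)]

def act(t, x):
    """World point x mapped to L x + a."""
    L, a = t
    return [p + q for p, q in zip(apply(L, x), a)]

def interval(x, y):
    """Generalized squared distance (y - x)^2."""
    d = [q - p for p, q in zip(x, y)]
    return d[0] ** 2 - d[1] ** 2 - d[2] ** 2 - d[3] ** 2

t1 = (boost_x(0.6), [1.0, 2.0, 3.0, 4.0])
t2 = (boost_x(-0.3), [0.5, 0.0, -1.0, 2.0])
x, y = [0.0, 1.0, 2.0, 3.0], [5.0, -1.0, 0.5, 2.0]

# Acting with the composed transformation equals acting successively.
direct = act(compose(t2, t1), x)
stepwise = act(t2, act(t1, x))
# Applying a transformation and then its inverse returns the original point.
round_trip = act(inverse(t1), act(t1, x))
```

Comparing `direct` with `stepwise`, `round_trip` with `x`, and `interval(x, y)` before and after applying `t1` covers composition, inverse, and invariance of the generalized distance.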

By taking the translational part to be zero, we see that the matrices $\Lambda$ form 
a group by themselves. This group is said to be the homogeneous Lorentz group 
(L). The specific properties of the homogeneous Lorentz group follow from (4.24) 
(with $\alpha = 1$). They are: 

1. $(\det\Lambda)^2 = 1$. Because $\Lambda$ is real, this implies that either $\det\Lambda = +1$ or 
$\det\Lambda = -1$. The transformations with determinant +1 are called proper 
Lorentz transformations. 

2. $(\Lambda^0{}_0)^2 \ge 1$. Hence, either $\Lambda^0{}_0 \ge +1$ or $\Lambda^0{}_0 \le -1$. This inequality is obtained 
from (4.24) by taking the special values $\mu = \nu = 0$, viz. 

$$\Lambda^\sigma{}_0\, g_{\sigma\tau}\, \Lambda^\tau{}_0 = (\Lambda^0{}_0)^2 - \sum_{i=1}^{3} (\Lambda^i{}_0)^2 = 1 , \quad \text{or} \quad (\Lambda^0{}_0)^2 = 1 + \sum_{i=1}^{3} (\Lambda^i{}_0)^2 .$$

Transformations with $\Lambda^0{}_0 \ge +1$ are said to be orthochronous. They yield a 
“forward” mapping of time, in contrast to the transformations with $\Lambda^0{}_0 \le -1$, 
which relate future and past. 

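Thus, there are four types of homogeneous Lorentz transformations, which are 
denoted as follows: $L_+^\uparrow$, $L_+^\downarrow$, $L_-^\uparrow$, $L_-^\downarrow$. The index + or - refers to the property 
$\det\Lambda = +1$ and $\det\Lambda = -1$, respectively; the arrow pointing upwards means 
$\Lambda^0{}_0 \ge +1$, while the arrow pointing downwards means $\Lambda^0{}_0 \le -1$. Special examples for 
the four types are the following. 

(i) The identity belongs to the branch $L_+^\uparrow$: 

$$E = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \in L_+^\uparrow . \tag{4.28}$$

(ii) Reflection of the space axes (parity) belongs to the branch $L_-^\uparrow$: 

$$P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \in L_-^\uparrow . \tag{4.29}$$

(iii) Reversal of the time direction (time reversal) belongs to $L_-^\downarrow$: 

$$T = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \in L_-^\downarrow . \tag{4.30}$$

(iv) The product PT of time reversal and space reflection belongs to $L_+^\downarrow$: 

$$PT = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \in L_+^\downarrow . \tag{4.31}$$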



At this point, we wish to make a few remarks relevant to what follows. The four 
discrete transformations (4.28-31) themselves form what is called Klein’s group, 

$$\{E, P, T, PT\} . \tag{4.32}$$

Indeed, one can easily verify that the product of any two of them is an element 
of the group. 

It is also clear that two arbitrary transformations belonging to different branches 
cannot be made to coincide by continuous deformation. Indeed, as long as $\Lambda$ is real, 
transformations with determinant +1 and those with determinant -1 are separated 
discontinuously from each other. (Likewise, transformations with $\Lambda^0{}_0 \ge +1$ and 
with $\Lambda^0{}_0 \le -1$ cannot be related by continuity.) However, for given $\Lambda \in L_+^\uparrow$, 
we note that the product $\Lambda P$ is in $L_-^\uparrow$, the product $\Lambda T$ is in $L_-^\downarrow$, and the product 
$\Lambda(PT)$ is in $L_+^\downarrow$. Thus, if we know the transformations belonging to $L_+^\uparrow$ (the proper, 
orthochronous Lorentz transformations), those pertaining to the other branches can 
be generated from them by multiplication with P, T, and (PT). These relations are 
summarized in Table 4.1. 



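Table 4.1. The four disjoint branches of the homogeneous Lorentz group 

Branch                 Conditions                                      Examples 
$L_+^\uparrow$         $\det\Lambda = +1$, $\Lambda^0{}_0 \ge +1$      E, rotations, special Lorentz transformations 
$L_+^\downarrow$       $\det\Lambda = +1$, $\Lambda^0{}_0 \le -1$      PT, as well as all $\Lambda(PT)$ with $\Lambda \in L_+^\uparrow$ 
$L_-^\uparrow$         $\det\Lambda = -1$, $\Lambda^0{}_0 \ge +1$      P, as well as all $\Lambda P$ with $\Lambda \in L_+^\uparrow$ 
$L_-^\downarrow$       $\det\Lambda = -1$, $\Lambda^0{}_0 \le -1$      T, as well as all $\Lambda T$ with $\Lambda \in L_+^\uparrow$ 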



Finally, we conclude that the branch $L_+^\uparrow$ is a subgroup of the homogeneous 
Lorentz group. Indeed, the composition of two transformations of $L_+^\uparrow$ is again an 
element of $L_+^\uparrow$. Furthermore, it contains the unit element as well as the inverse of 
any of its elements. This subgroup $L_+^\uparrow$ is called the proper, orthochronous Lorentz 
group. (In contrast to $L_+^\uparrow$, the remaining three branches are not subgroups.) 
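
The classification into branches and the closure of Klein's group lend themselves to a quick numerical check. The sketch below is an illustration only: the function names `det` and `branch` and the sample boost matrix B (with $\gamma = 1.25$, $\beta = 0.6$) are assumptions made for this example.

```python
def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def diag(a, b, c, d):
    return [[a, 0, 0, 0], [0, b, 0, 0], [0, 0, c, 0], [0, 0, 0, d]]

E = diag(1, 1, 1, 1)      # identity, (4.28)
P = diag(1, -1, -1, -1)   # space reflection, (4.29)
T = diag(-1, 1, 1, 1)     # time reversal, (4.30)
PT = matmul(P, T)         # (4.31)

def branch(L):
    """Label the branch of L by (sign of det L, orientation of L^0_0)."""
    sign = '+' if det(L) > 0 else '-'
    arrow = 'up' if L[0][0] >= 1 else 'down'
    return sign, arrow

# Klein's group: the product of any two elements is again an element.
klein = [E, P, T, PT]
closed = all(matmul(a, b) in klein for a in klein for b in klein)

# A proper, orthochronous sample: boost along the 1-axis, gamma = 1.25, beta = 0.6.
B = [[1.25, -0.75, 0, 0], [-0.75, 1.25, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
branches = {name: branch(M) for name, M in
            [("B", B), ("BP", matmul(B, P)), ("BT", matmul(B, T)), ("BPT", matmul(B, PT))]}
```

Multiplying B by P, T, and PT moves it into the other three branches, which mirrors the content of Table 4.1.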



4.4.1 Rotations and Special Lorentz Transformations (“Boosts”) 

The rotations in three-dimensional space, well-known to us from Sect. 2.22, leave 
the spatial distance $|\mathbf{x} - \mathbf{y}|$ invariant. As they do not change the time component 
of any four-vector, the transformations 

$$\Lambda(\mathbf{R}) \equiv \mathcal{R} \stackrel{\text{def}}{=} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & & & \\ 0 & & \mathbf{R} & \\ 0 & & & \end{pmatrix} \tag{4.33}$$

with $\mathbf{R} \in SO(3)$ leave invariant the form $(z^0)^2 - \mathbf{z}^2$. Thus, they are Lorentz 
transformations. Now, obviously $\mathcal{R}^0{}_0 = +1$ and $\det\mathcal{R} = \det\mathbf{R} = +1$, so they 
belong to the branch $L_+^\uparrow$. Thus, extending the rotations in three-dimensional space 
by adding a 1 in the time-time component, and zeros in the time-space and the 
space-time components, as shown in (4.33), we obtain a subgroup of $L_+^\uparrow$. 

We now turn to the relativistic generalization of the special Galilei 
transformations 

$$\mathbf{x}' = \mathbf{x} - \mathbf{v}t , \qquad t' = t . \tag{4.34}$$

Their relativistic counterparts are called special Lorentz transformations, or boosts. 
They are obtained as follows. 

As we know, boosts describe the situation where two inertial systems of 
reference K and K' move relative to each other with constant velocity $\mathbf{v}$. Figure 4.2 
shows the example of uniform motion along the spatial 1-axis, $\mathbf{v} = v\hat{\mathbf{e}}_1$. The space 
components that are transverse to the 1-axis (those with indices 2 and 3) are certainly not changed, i.e. 

$$z'^{2} = z^{2} , \qquad z'^{3} = z^{3} .$$

Regarding the remaining components of the four-vector z, this implies that the 
form $(z^0)^2 - (z^1)^2$ must be invariant, 

$$(z^0)^2 - (z^1)^2 = (z^0 + z^1)(z^0 - z^1) = \text{invariant} .$$

Thus, we must have 

$$z'^0 + z'^1 = f(v)\,(z^0 + z^1) , \qquad z'^0 - z'^1 = \frac{1}{f(v)}\,(z^0 - z^1) ,$$

with the conditions $f(v) > 0$ and $\lim_{v \to 0} f(v) = 1$. Furthermore, the origin O' of K' 
moves with velocity v, relative to K. Thus, the primed and unprimed 1-components 
of O' are, respectively, 

$$z'^1 = \frac{1}{2}\left(f - \frac{1}{f}\right) z^0 + \frac{1}{2}\left(f + \frac{1}{f}\right) z^1 = 0 , \qquad z^1 = \frac{v}{c}\, z^0 ,$$

from which follow 

$$(f^2 - 1) + \frac{v}{c}\,(f^2 + 1) = 0 ,$$

and, finally, 

$$f(v) = \sqrt{\frac{1 - (v/c)}{1 + (v/c)}} . \tag{4.35}$$

Fig. 4.2. K and K' are inertial systems that move at a constant velocity $\mathbf{v} = v\hat{\mathbf{e}}_1$ 
relative to each other. At t = 0, or for v = 0, the two systems coincide 

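At this point let us introduce the following (universally accepted) abbreviations: 

$$\beta \stackrel{\text{def}}{=} \frac{|\mathbf{v}|}{c} , \qquad \gamma \stackrel{\text{def}}{=} \frac{1}{\sqrt{1 - \beta^2}} . \tag{4.36}$$

Using this notation, $z^1$ and $z^0$ are seen to transform as follows: 

$$\begin{pmatrix} z'^0 \\ z'^1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} f + 1/f & f - 1/f \\ f - 1/f & f + 1/f \end{pmatrix} \begin{pmatrix} z^0 \\ z^1 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta \\ -\gamma\beta & \gamma \end{pmatrix} \begin{pmatrix} z^0 \\ z^1 \end{pmatrix} ,$$

where $0 \le \beta < 1$, $\gamma \ge 1$. Including the 2- and 3-components, the special Lorentz 
transformation (boost) that we are out to construct reads 

$$L(-\mathbf{v})\big|_{\mathbf{v} = v\hat{\mathbf{e}}_1} = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{4.37}$$

It has the properties $L^0{}_0 \ge +1$ and $\det L = +1$, and therefore it belongs to $L_+^\uparrow$. 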
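
As a cross-check of the construction, one can confirm numerically that the function f of (4.35) reproduces $\gamma$ and $-\gamma\beta$, and that the matrix (4.37) satisfies $L^T g L = g$. This is a sketch under the assumption $\beta = 0.6$; the function name `boost_minus_x` is chosen here for illustration.

```python
import math

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def boost_minus_x(beta):
    """Matrix (4.37): L(-v) for a velocity v = beta*c along the 1-axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

beta = 0.6
L = boost_minus_x(beta)

f = math.sqrt((1.0 - beta) / (1.0 + beta))        # eq. (4.35)
gamma_from_f = 0.5 * (f + 1.0 / f)                # should equal gamma
minus_gamma_beta_from_f = 0.5 * (f - 1.0 / f)     # should equal -gamma*beta

invariant = matmul(transpose(L), matmul(G, L))    # should reproduce g
det_2x2 = L[0][0] * L[1][1] - L[0][1] * L[1][0]   # determinant of the nontrivial block
```

For $\beta = 0.6$ one has f = 0.5, hence $\gamma = 1.25$ and $\gamma\beta = 0.75$.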

Without loss of generality we could have parametrized the function f(v) (4.35) 
by 

$$f(v) = \exp(-\lambda(v)) . \tag{4.38}$$

As we shall show below (Sect. 4.6), the parameter $\lambda$ is a relativistic generalization 
of the (modulus of the) velocity. For this reason it is called rapidity. Using this 
parametrization, the transformation (4.37) takes the form 

$$L(-v\hat{\mathbf{e}}_1) = \begin{pmatrix} \cosh\lambda & -\sinh\lambda & 0 & 0 \\ -\sinh\lambda & \cosh\lambda & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{4.39}$$

where $\lambda$ and $|\mathbf{v}|$ are related by 

$$\tanh\lambda = \frac{|\mathbf{v}|}{c} = \beta . \tag{4.40}$$
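
The step from (4.37) to (4.39) can be spelled out as a short worked equation. The block below only combines (4.35), (4.36), and (4.38); it is added here as a reading aid and introduces no new symbols.

```latex
% With f = e^{-\lambda} from (4.38) and f^2 = (1-\beta)/(1+\beta) from (4.35):
\cosh\lambda = \tfrac12\left(e^{\lambda} + e^{-\lambda}\right)
             = \tfrac12\left(\tfrac{1}{f} + f\right) = \gamma ,
\qquad
\sinh\lambda = \tfrac12\left(\tfrac{1}{f} - f\right) = \gamma\beta ,
\qquad
\tanh\lambda = \frac{1 - f^{2}}{1 + f^{2}} = \beta .
% Example: \beta = 0.6 gives f = 1/2, \lambda = \ln 2, \cosh\lambda = 1.25, \sinh\lambda = 0.75.
```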



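If the velocity $\mathbf{v}$ does not point along the direction of the 1-axis, the transformation 
(4.37) takes the form 

$$L(-\mathbf{v}) = \begin{pmatrix} \gamma & -\gamma\, \dfrac{v^k}{c} \\[2mm] -\gamma\, \dfrac{v^i}{c} & \delta^{ik} + \dfrac{\gamma^2}{1 + \gamma}\, \dfrac{v^i v^k}{c^2} \end{pmatrix} . \tag{4.41}$$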



This more general expression is derived by means of the following steps. The 
matrix (4.37) that describes the case $\mathbf{v} = v\hat{\mathbf{e}}_1$ is symmetric. It transforms the time 
and the 1-coordinates in a nontrivial way but leaves unchanged the directions 
perpendicular to $\mathbf{v}$. In particular, we have 

$$z'^0 = \gamma\,[z^0 - \beta z^1] = \gamma \left[ z^0 - \frac{1}{c}\, \mathbf{v} \cdot \mathbf{z} \right] ,$$

$$z'^1 = \gamma\,[-\beta z^0 + z^1] = \gamma \left[ -\frac{v^1}{c}\, z^0 + z^1 \right] .$$

If $\mathbf{v}$ has an arbitrary direction in space, one could, of course, rotate the coordinate 
system by means of a rotation $\mathcal{R}$ in such a way that the new 1-axis points along $\mathbf{v}$. 
The boost L would then have precisely the form of (4.37). Finally, one could undo 
the rotation. As L is symmetric, so is the product $\mathcal{R}^{-1} L \mathcal{R}$. Without calculating this 
rotation explicitly, we can use the following form for the boost that we wish to 
construct: 

$$L(-\mathbf{v}) = \begin{pmatrix} \gamma & -\gamma\, \dfrac{v^k}{c} \\[2mm] -\gamma\, \dfrac{v^i}{c} & T^{ik} \end{pmatrix} \quad \text{with} \quad T^{ik} = T^{ki} .$$

For vanishing velocity $T^{ik}$ becomes the unit matrix. Therefore, we can write $T^{ik}$ 
as follows: 

$$T^{ik} = \delta^{ik} + a\, \frac{v^i v^k}{c^2} .$$

We determine the coefficient a by making use of our knowledge of the coordinates 
of O', the origin of K', in either system of reference. With respect to K' we have 

$$z'^i = -\gamma\, \frac{v^i}{c}\, z^0 + \sum_{k=1}^{3} T^{ik} z^k = 0 .$$

As seen from K, O' moves at a constant velocity $\mathbf{v}$, i.e. $z^k = v^k z^0 / c$. From these 
equations follows the requirement 

$$\sum_{k=1}^{3} T^{ik} v^k = \left( 1 + a\, \frac{\mathbf{v}^2}{c^2} \right) v^i = \gamma\, v^i ,$$

or $1 + a\beta^2 = \gamma$, and, finally, $a = \gamma^2/(\gamma + 1)$. This completes the construction of 
(4.41). 
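
The defining requirements used in this construction can be verified numerically for a velocity in an arbitrary direction. The following sketch assumes c = 1 and the sample velocity (0.3, -0.4, 0.5); the function name `boost_minus` is a hypothetical label for the matrix $L(-\mathbf{v})$ of (4.41).

```python
import math

def boost_minus(v, c=1.0):
    """Matrix L(-v) of (4.41) for a velocity v = (v1, v2, v3) with |v| < c."""
    beta2 = sum(x * x for x in v) / c ** 2
    gamma = 1.0 / math.sqrt(1.0 - beta2)
    a = gamma ** 2 / (1.0 + gamma)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = gamma
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -gamma * v[i] / c
        for k in range(3):
            L[i + 1][k + 1] = (1.0 if i == k else 0.0) + a * v[i] * v[k] / c ** 2
    return L

def apply(L, z):
    return [sum(L[i][k] * z[k] for k in range(4)) for i in range(4)]

v = [0.3, -0.4, 0.5]
beta2 = sum(x * x for x in v)
gamma = 1.0 / math.sqrt(1.0 - beta2)
a = gamma ** 2 / (1.0 + gamma)

# World point of O' as seen from K at time t (c = 1): z = (t, v t).
t = 2.0
origin_in_K = [t] + [vi * t for vi in v]
# In K' the origin O' must sit at spatial position zero.
origin_in_Kprime = apply(boost_minus(v), origin_in_K)
```

The spatial components of `origin_in_Kprime` vanish, and its time component equals $t/\gamma$, the time elapsed on a clock that is at rest in K'.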







4.4.2 Interpretation of Special Lorentz Transformations 



First, we verify that the transformation (4.41) becomes the special Galilei 
transformation (4.34) whenever the velocity is small compared to c. We develop matrix 
(4.41) in terms of $\beta = |\mathbf{v}|/c$, up to first order, viz. 

$$L(-\mathbf{v}) = \begin{pmatrix} 1 & -v^k/c \\ -v^i/c & \delta^{ik} \end{pmatrix} + O(\beta^2) .$$

Neglecting the terms of order $O(\beta^2)$, we indeed obtain $t' = t$ and $\mathbf{z}' = -\mathbf{v}t + \mathbf{z}$. 
(In $t'$ the term $\mathbf{v} \cdot \mathbf{z}/c^2$ is dropped as well; relative to t it is of order $\beta^2$ for 
distances $|\mathbf{z}|$ of order $|\mathbf{v}|t$.) 
Thus, the transformation rule (4.34) holds approximately for $(v/c)^2 \ll 1$. This is 
an excellent approximation for the planets of our solar system. For example, the 
earth’s orbital velocity is about 30 km s$^{-1}$, and therefore $(v/c)^2 \simeq 10^{-8}$. 
Elementary particles, on the other hand, can be accelerated to velocities very close to the 
speed of light. In this case transformation (4.41) is very different from (4.34). 
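
The size of the relativistic corrections for the earth's orbital motion can be made concrete with a few lines of arithmetic. This is a rough sketch: the event coordinates (t = 1 s, x = 1 km) are arbitrary sample values, and the function names are chosen for this example only.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz(t, x, v):
    """Boost along the 1-axis, written for t and x instead of x^0 = ct."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C ** 2), gamma * (x - v * t)

def galilei(t, x, v):
    """Special Galilei transformation (4.34) along the 1-axis."""
    return t, x - v * t

v_earth = 3.0e4                  # about 30 km/s
beta2 = (v_earth / C) ** 2       # close to 1e-8

t, x = 1.0, 1.0e3                # sample event: 1 s, 1 km
t_l, x_l = lorentz(t, x, v_earth)
t_g, x_g = galilei(t, x, v_earth)

relative_difference_x = abs(x_l - x_g) / abs(x_g)
difference_t = abs(t_l - t_g)
```

Both differences come out at the level of a few parts in $10^{9}$, consistent with corrections of order $(v/c)^2$.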

An instructive way of visualizing the special Lorentz transformation (4.41) is 
to think of K' as being fixed in an elementary particle that moves at a constant 
velocity v with respect to the inertial system K. K' is then said to be the rest system 
of the particle, while K could be the laboratory system. 

Transformation (4.41) describes the transition between laboratory and rest sys- 
tem; it “boosts” the particle from its state of rest to the state with velocity v. To 
make this clear, we anticipate a little by defining the following four-vector: 



$$\omega = (\gamma c, \gamma \mathbf{v})^T . \tag{4.42}$$

(A more detailed reasoning will be given in Sect. 4.8 below.) The generalized, 
squared norm of this vector is $\omega^2 = \gamma^2 c^2 (1 - \mathbf{v}^2/c^2) = c^2$. If we apply the matrix 
(4.41) to $\omega$, we obtain 

$$L(-\mathbf{v})\, \omega = \omega^{(0)} \tag{4.43}$$

with $\omega^{(0)} = (c, \mathbf{0})^T$. Thus, this vector must be related to the relativistic 
generalization of velocity or of momentum. We see that $L(-\mathbf{v})$ transforms something 
moving with velocity $\mathbf{v}$ to something at rest (velocity 0), hence the minus sign 
in the definition above. We shall say more about the interpretation of $\omega$ later and 
return to the analysis of the Lorentz transformations. 
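
Relation (4.43) is easy to confirm numerically. The sketch below assumes c = 1 and a sample velocity; `boost_minus` is again an illustrative name for the matrix (4.41).

```python
import math

def boost_minus(v, c=1.0):
    """Matrix L(-v) of (4.41)."""
    beta2 = sum(x * x for x in v) / c ** 2
    gamma = 1.0 / math.sqrt(1.0 - beta2)
    a = gamma ** 2 / (1.0 + gamma)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = gamma
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -gamma * v[i] / c
        for k in range(3):
            L[i + 1][k + 1] = (1.0 if i == k else 0.0) + a * v[i] * v[k] / c ** 2
    return L

def apply(L, z):
    return [sum(L[i][k] * z[k] for k in range(4)) for i in range(4)]

def norm2(w):
    """Generalized squared norm with signature (+, -, -, -)."""
    return w[0] ** 2 - sum(x * x for x in w[1:])

c = 1.0
v = [0.3, -0.4, 0.5]
gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c ** 2)

omega = [gamma * c] + [gamma * vi for vi in v]   # eq. (4.42)
omega_rest = apply(boost_minus(v, c), omega)     # should be (c, 0, 0, 0)
```

The squared norm is the same before and after the boost, as it must be for a Lorentz transformation.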
4.5 Decomposition of Lorentz Transformations into Their Components 

4.5.1 Proposition on Orthochronous, Proper Lorentz Transformations 

The structure of the homogeneous, proper, orthochronous Lorentz group $L_+^\uparrow$ is 
clarified by the following theorem. 



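Decomposition Theorem. Every transformation $\Lambda$ of $L_+^\uparrow$ can be written, 
in a unique way, as the product of a rotation and a special Lorentz 
transformation following the rotation: 

$$\Lambda = L(\mathbf{v})\, \mathcal{R} \quad \text{with} \quad \mathcal{R} = \begin{pmatrix} 1 & 0 \\ 0 & \mathbf{R} \end{pmatrix} , \quad \mathbf{R} \in SO(3) . \tag{4.44}$$

The parameters of the two transformations are given by the following 
expressions: 

$$v^i/c = \Lambda^i{}_0 / \Lambda^0{}_0 , \tag{4.45}$$

$$R^{ik} = \Lambda^i{}_k - \frac{1}{1 + \Lambda^0{}_0}\, \Lambda^i{}_0\, \Lambda^0{}_k . \tag{4.46}$$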



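Proof. As a first step one verifies that the velocity defined by (4.45) is an 
admissible velocity, i.e. that it does not exceed the speed of light. This follows from 
(4.24) (with $\alpha = 1$): 

$$\Lambda^T g \Lambda = g \tag{4.47}$$

or 

$$\Lambda^\mu{}_\sigma\, g_{\mu\nu}\, \Lambda^\nu{}_\tau = g_{\sigma\tau} . \tag{4.47'}$$

Choosing $\sigma = \tau = 0$, then $\sigma = i$, $\tau = k$, and then $\sigma = 0$, $\tau = i$, we find that 
(4.47) yields the following equations, respectively, 

$$(\Lambda^0{}_0)^2 - \sum_{i=1}^{3} (\Lambda^i{}_0)^2 = 1 , \tag{4.48a}$$

$$\Lambda^0{}_i \Lambda^0{}_k - \sum_{j=1}^{3} \Lambda^j{}_i \Lambda^j{}_k = -\delta_{ik} , \tag{4.48b}$$

$$\Lambda^0{}_0 \Lambda^0{}_i - \sum_{j=1}^{3} \Lambda^j{}_0 \Lambda^j{}_i = 0 . \tag{4.48c}$$

Now, from (4.48a) we indeed find that 

$$\frac{\mathbf{v}^2}{c^2} = \frac{\sum_i (\Lambda^i{}_0)^2}{(\Lambda^0{}_0)^2} = \frac{(\Lambda^0{}_0)^2 - 1}{(\Lambda^0{}_0)^2} < 1 .$$

Comparison with the general expression (4.41) for a boost then gives 

$$L^0{}_0(\mathbf{v}) = \Lambda^0{}_0 , \qquad L^0{}_i(\mathbf{v}) = L^i{}_0(\mathbf{v}) = \Lambda^i{}_0 , \qquad L^i{}_k(\mathbf{v}) = \delta^{ik} + \frac{1}{1 + \Lambda^0{}_0}\, \Lambda^i{}_0 \Lambda^k{}_0 . \tag{4.49}$$

As a second step we define 

$$\mathcal{R} \stackrel{\text{def}}{=} L^{-1}(\mathbf{v})\, \Lambda = L(-\mathbf{v})\, \Lambda \tag{4.50}$$

and show that $\mathcal{R}$ is a rotation. This follows by means of (4.48a) and (4.48c) and 
by doing the multiplication on the right-hand side of (4.50), viz. 

$$\mathcal{R}^0{}_0 = (\Lambda^0{}_0)^2 - \sum_i (\Lambda^i{}_0)^2 = 1 ,$$

$$\mathcal{R}^0{}_i = \Lambda^0{}_0 \Lambda^0{}_i - \sum_j \Lambda^j{}_0 \Lambda^j{}_i = 0 .$$

At the same time we calculate the space-space components of the rotation, 

$$\mathcal{R}^i{}_k = \Lambda^i{}_k - \Lambda^i{}_0 \Lambda^0{}_k + \frac{1}{1 + \Lambda^0{}_0}\, \Lambda^i{}_0 \sum_j \Lambda^j{}_0 \Lambda^j{}_k .$$

Inserting (4.48c) in the right-hand side yields assertion (4.46). 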

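As a third and last step it remains to show that the decomposition (4.44) is 
unique. For this purpose assume that there are two different velocities $\mathbf{v}$ and $\bar{\mathbf{v}}$, as 
well as two different rotations $\mathbf{R}$ and $\bar{\mathbf{R}}$ of SO(3), such that 

$$\Lambda = L(\mathbf{v})\, \mathcal{R} = L(\bar{\mathbf{v}})\, \bar{\mathcal{R}}$$

holds true. From this we would conclude that 

$$L(-\mathbf{v})\, \Lambda\, \mathcal{R}^{-1} = \mathbb{1} = L(-\mathbf{v})\, L(\bar{\mathbf{v}})\, \bar{\mathcal{R}}\, \mathcal{R}^{-1} .$$

Taking the time-time component of this expression, for example, we would obtain 

$$1 = \sum_{\nu=0}^{3} L^0{}_\nu(-\mathbf{v})\, L^\nu{}_0(\bar{\mathbf{v}}) = \gamma \bar{\gamma} \left[ 1 - \frac{1}{c^2}\, \mathbf{v} \cdot \bar{\mathbf{v}} \right] .$$

This equation can be correct only if $\mathbf{v} = \bar{\mathbf{v}}$. If this is so then also $\mathcal{R} = \bar{\mathcal{R}}$. Thus 
the theorem is proved. □ 

4.5.2 Corollary of the Decomposition Theorem and Some Consequences 

Note the order of the factors of the decomposition (4.44): the rotation $\mathcal{R}$ is applied 
first and is followed by the boost $L(\mathbf{v})$. One could prove the decomposition of 
$\Lambda \in L_+^\uparrow$, with a different order of its factors, as well, viz. 

$$\Lambda = \mathcal{R}\, L(\mathbf{w}) \quad \text{with} \quad \mathcal{R} = \begin{pmatrix} 1 & 0 \\ 0 & \mathbf{R} \end{pmatrix} , \quad \mathbf{R} \in SO(3) , \tag{4.51}$$

where the vector $\mathbf{w}$ is given by 

$$\frac{w^i}{c} \stackrel{\text{def}}{=} \frac{\Lambda^0{}_i}{\Lambda^0{}_0} \tag{4.52}$$

and where $\mathbf{R}$ is the same rotation as in (4.46). The proof starts from the relation 

$$\Lambda g \Lambda^T = g , \tag{4.53}$$

or 

$$\Lambda^\mu{}_\sigma\, g^{\sigma\tau}\, \Lambda^\nu{}_\tau = g^{\mu\nu} , \tag{4.53'}$$

which is the analog of (4.47) and which says no more than that if $\Lambda$ belongs to 
$L_+^\uparrow$, then its inverse $\Lambda^{-1} = g \Lambda^T g$ also belongs to $L_+^\uparrow$. Otherwise the steps of 
the proof are the same as in Sect. 4.5.1. One verifies, by direct calculation, that 
$\mathbf{v} = \mathbf{R}\mathbf{w}$. This is not surprising because, by comparing (4.44) and (4.51), we find 

$$L(\mathbf{v}) = \mathcal{R}\, L(\mathbf{w})\, \mathcal{R}^{-1} = L(\mathbf{R}\mathbf{w}) . \tag{4.54}$$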
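
Both orderings of the decomposition can be tested on a concrete example. The sketch below builds $\Lambda = L(\mathbf{v})\mathcal{R}$ from a known velocity and a known rotation, then recovers them from (4.45), (4.46), and (4.52). It assumes c = 1; the function names and the choice of a rotation in the 1-2 plane are illustrative only.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def boost(v):
    """L(v) with c = 1, i.e. the matrix (4.41) with v replaced by -v."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v))
    a = gamma ** 2 / (1.0 + gamma)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = gamma
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = gamma * v[i]
        for k in range(3):
            L[i + 1][k + 1] = (1.0 if i == k else 0.0) + a * v[i] * v[k]
    return L

def rotation(phi):
    """A sample element of SO(3) (rotation in the 1-2 plane), embedded as in (4.33)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def decompose(Lam):
    """Return (v, R) with Lam = L(v) R, using (4.45) and (4.46)."""
    l00 = Lam[0][0]
    v = [Lam[i][0] / l00 for i in (1, 2, 3)]
    R = [[Lam[i][k] - Lam[i][0] * Lam[0][k] / (1.0 + l00) for k in (1, 2, 3)]
         for i in (1, 2, 3)]
    return v, R

v_in = [0.3, -0.4, 0.5]
rot = rotation(0.7)
Lam = matmul(boost(v_in), rot)

v_out, R_out = decompose(Lam)

# Second ordering (4.51): Lam = R L(w) with w^i = Lam^0_i / Lam^0_0, eq. (4.52).
w_out = [Lam[0][i] / Lam[0][0] for i in (1, 2, 3)]
R_times_w = [sum(R_out[i][k] * w_out[k] for k in range(3)) for i in range(3)]
```

The recovered `v_out` and `R_out` agree with the inputs, and `R_times_w` reproduces `v_out`, in line with $\mathbf{v} = \mathbf{R}\mathbf{w}$.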



The decomposition theorem has several important consequences. 

(i) The decomposition is useful in proving that every timelike four-vector can 
be mapped to the normal form (4.27a), every spacelike vector to the normal form 
(4.27b), and every lightlike vector to the form (4.27c). We choose the example of 
a timelike vector, $z = (z^0, \mathbf{z})$ with $z^2 = (z^0)^2 - (\mathbf{z})^2 > 0$. By a rotation it assumes 
the form $(z^0, z^1, 0, 0)$. If $z^0$ is negative, apply PT to z so that $z^0$ becomes positive 
and hence $z^0 > |z^1|$. As one verifies by explicit calculation, the boost along the 
1-axis in the parametrization (4.39), with the parameter $\lambda$ as obtained from 

$$e^{-\lambda} = \sqrt{(z^0 - z^1)/(z^0 + z^1)} ,$$

takes the vector to the form of (4.27a). 
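
The explicit calculation mentioned above can be carried out for a sample vector. The sketch assumes that the rotation has already been applied, so that the vector has the form $(z^0, z^1, 0, 0)$ with $z^0 > |z^1|$, and it uses the boost in the form (4.39). The function name is a hypothetical choice.

```python
import math

def to_normal_form(z0, z1):
    """Boost (4.39) taking the timelike vector (z0, z1, 0, 0) to (sqrt(z^2), 0, 0, 0).

    Returns the rapidity and the transformed (time, 1-) components.
    """
    # Rapidity from e^{-lam} = sqrt((z0 - z1) / (z0 + z1)).
    lam = -0.5 * math.log((z0 - z1) / (z0 + z1))
    ch, sh = math.cosh(lam), math.sinh(lam)
    return lam, (ch * z0 - sh * z1, -sh * z0 + ch * z1)

lam, (t_new, x_new) = to_normal_form(5.0, 3.0)
```

For the sample vector (5, 3, 0, 0) one has $z^2 = 16$, so the normal form is (4, 0, 0, 0), reached with $\lambda = \ln 2$.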

(ii) The group $L_+^\uparrow$ is a Lie group and contains the rotation group SO(3) as a 
subgroup. The decomposition theorem tells us that $L_+^\uparrow$ depends on six real 
parameters: the three angles of the rotation and the three components of the velocity. 
Thus, its Lie algebra is made up of six generators. More precisely, to the real 
angles characterizing the rotations there correspond the directions of the boosts and 
the rapidity parameter $\lambda$. This parameter has its value in the interval $[0, \infty)$. While 
the manifold of the rotation angles is compact, that of $\lambda$ is not. Indeed, the Lorentz 
group is found to be noncompact. Therefore, its structure and its representations 
are not simple and must be studied separately. This is beyond the scope of this 
book. 

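(iii) It is not difficult, though, to construct the six generators of $L_+^\uparrow$. We already 
know the generators for rotations, see Sect. 2.22. Adding the time-time and 
space-time components they are 

$$\mathbf{J}_i = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & & & \\ 0 & & (J_i) & \\ 0 & & & \end{pmatrix} , \tag{4.55}$$

where $(J_i)$ are the 3×3 matrices given in (2.71). The generators for 
infinitesimal boosts are derived in an analogous manner. The example of a special Lorentz 
transformation along the 1-axis (4.39) contains the submatrix 

$$A \stackrel{\text{def}}{=} \begin{pmatrix} \cosh\lambda & \sinh\lambda \\ \sinh\lambda & \cosh\lambda \end{pmatrix} = \mathbb{1} \sum_{n=0}^{\infty} \frac{\lambda^{2n}}{(2n)!} + K \sum_{n=0}^{\infty} \frac{\lambda^{2n+1}}{(2n+1)!} \quad \text{with} \quad K = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} .$$

The latter matrix (it is the Pauli matrix $\sigma^{(1)}$) has the following properties: 

$$K^{2n} = \mathbb{1} , \qquad K^{2n+1} = K .$$

Therefore, we have 

$$A = \sum_{n=0}^{\infty} \left( \frac{\lambda^{2n}}{(2n)!}\, K^{2n} + \frac{\lambda^{2n+1}}{(2n+1)!}\, K^{2n+1} \right) = \exp(\lambda K) .$$

Alternatively, writing this exponential series by means of Gauss’s formula for the 
exponential, 

$$A = \lim_{k \to \infty} \left( \mathbb{1} + \frac{\lambda}{k}\, K \right)^k ,$$

we see that the finite boost is generated by successive application of very many 
infinitesimal ones. From this argument we deduce the generator for infinitesimal 
boosts along the 1-axis: 

$$\mathbf{K}_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{4.56}$$

It is then easy to guess the analogous expressions for the generators $\mathbf{K}_2$ and $\mathbf{K}_3$ 
for infinitesimal boosts along the 2-axis and the 3-axis, respectively: 

$$\mathbf{K}_2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \qquad \mathbf{K}_3 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} . \tag{4.57}$$

By the decomposition theorem every $\Lambda$ of $L_+^\uparrow$ can be written as follows: 

$$\Lambda = \exp(-\boldsymbol{\varphi} \cdot \mathbf{J})\, \exp(\lambda\, \hat{\mathbf{w}} \cdot \mathbf{K}) , \tag{4.58}$$

where $\mathbf{J} = (\mathbf{J}_1, \mathbf{J}_2, \mathbf{J}_3)$, $\mathbf{K} = (\mathbf{K}_1, \mathbf{K}_2, \mathbf{K}_3)$, and $\lambda = \operatorname{arctanh}(|\mathbf{w}|/c)$. 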
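
The two representations of the finite boost given above, the exponential series and Gauss's limit formula, can be compared numerically for the 2×2 submatrix. This is a sketch with assumed parameters: rapidity 0.8, 30 terms of the series, and $k = 2^{20}$ factors in the limit formula (evaluated by repeated squaring).

```python
import math

K = [[0.0, 1.0], [1.0, 0.0]]
ONE = [[1.0, 0.0], [0.0, 1.0]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(lam, terms=30):
    """Partial sum of exp(lam K) = sum over n of (lam K)^n / n!."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    power = ONE  # holds (lam K)^n / n!, starting with n = 0
    for n in range(terms):
        result = [[result[i][j] + power[i][j] for j in range(2)] for i in range(2)]
        power = [[lam * x / (n + 1) for x in row] for row in matmul2(power, K)]
    return result

def gauss_limit(lam, doublings=20):
    """(1 + lam K / k)^k with k = 2**doublings, computed by repeated squaring."""
    k = 2 ** doublings
    M = [[1.0, lam / k], [lam / k, 1.0]]
    for _ in range(doublings):
        M = matmul2(M, M)
    return M

lam = 0.8
series = exp_series(lam)
limit = gauss_limit(lam)
exact = [[math.cosh(lam), math.sinh(lam)], [math.sinh(lam), math.cosh(lam)]]
```

The series agrees with the hyperbolic functions to rounding accuracy, while the product of many near-identity factors converges more slowly, with an error of order $\lambda^2/k$.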

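(iv) It is instructive to compute the commutators of the matrices $\mathbf{J}_i$ and $\mathbf{K}_k$ as given 
by (4.56), (4.57), and (2.71). One finds that 

$$[\mathbf{J}_1, \mathbf{J}_2] \equiv \mathbf{J}_1 \mathbf{J}_2 - \mathbf{J}_2 \mathbf{J}_1 = \mathbf{J}_3 , \tag{4.59a}$$

$$[\mathbf{J}_1, \mathbf{K}_1] = 0 , \tag{4.59b}$$

$$[\mathbf{J}_1, \mathbf{K}_2] = \mathbf{K}_3 , \qquad [\mathbf{K}_1, \mathbf{J}_2] = \mathbf{K}_3 , \tag{4.59c}$$

$$[\mathbf{K}_1, \mathbf{K}_2] = -\mathbf{J}_3 . \tag{4.59d}$$

All other commutators are obtained from these by cyclic permutation of the indices. 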
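
The commutation relations (4.59a-d) and their cyclic permutations can be verified by direct matrix multiplication. The sketch below assumes that the 3×3 rotation generators of (2.71) have the form $(J_i)_{jk} = -\varepsilon_{ijk}$; this form is an assumption made here, since (2.71) is not reproduced in this section. Indices run from 0 to 2 in the code.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

def rotation_generator(i):
    """4x4 generator (4.55); space-space block assumed to be (J_i)_{jk} = -eps_{ijk}."""
    M = [[0] * 4 for _ in range(4)]
    for j in range(3):
        for k in range(3):
            M[j + 1][k + 1] = -eps(i, j, k)
    return M

def boost_generator(i):
    """4x4 generator of boosts along axis i+1, as in (4.56) and (4.57)."""
    M = [[0] * 4 for _ in range(4)]
    M[0][i + 1] = M[i + 1][0] = 1
    return M

def negative(M):
    return [[-x for x in row] for row in M]

J = [rotation_generator(i) for i in range(3)]
K = [boost_generator(i) for i in range(3)]
ZERO = [[0] * 4 for _ in range(4)]

# Check (4.59a-d) for all cyclic permutations (i, j, k) of (0, 1, 2).
relations_hold = all(
    commutator(J[i], J[j]) == J[k]
    and commutator(J[i], K[i]) == ZERO
    and commutator(J[i], K[j]) == K[k]
    and commutator(K[i], J[j]) == K[k]
    and commutator(K[i], K[j]) == negative(J[k])
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
)
```

All entries are integers, so the comparisons are exact.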

One can visualize the meaning of relations (4.59a-d) to some extent by 
recalling that the $\mathbf{J}_i$ and $\mathbf{K}_j$ generate infinitesimal transformations. For instance, (4.59a) 
tells us that two infinitesimal rotations by the angle $\varepsilon_1$ about the 1-axis and by the 
angle $\varepsilon_2$ about the 2-axis, when inverted in different order, give a net rotation about 
the 3-axis, by the angle $\varepsilon_1 \varepsilon_2$, 

$$\mathbf{R}^{-1}(0, \varepsilon_2, 0)\, \mathbf{R}^{-1}(\varepsilon_1, 0, 0)\, \mathbf{R}(0, \varepsilon_2, 0)\, \mathbf{R}(\varepsilon_1, 0, 0) = \mathbf{R}(0, 0, \varepsilon_1 \varepsilon_2) + O(\varepsilon_i^3) .$$

(The reader should work this out.) 

Equation (4.59b) states that a boost along a given direction is unchanged by 
a rotation about the same direction. Equation (4.59c) expresses the fact that the 
three matrices $(\mathbf{K}_1, \mathbf{K}_2, \mathbf{K}_3)$ transform under rotations like an ordinary vector in 
$\mathbb{R}^3$ (hence the notation in (4.58)). 

The commutation relation (4.59d) is the most interesting. If one applies a boost 
along the 1-axis, followed by a boost along the 2-axis, and then inverts these 
transformations in the “wrong” order, there results a pure rotation about the 3-axis. In 
order to see this clearly, let us consider 

$$L_1 \simeq \mathbb{1} + \lambda_1 \mathbf{K}_1 + \tfrac{1}{2} \lambda_1^2 \mathbf{K}_1^2 , \qquad L_2 \simeq \mathbb{1} + \lambda_2 \mathbf{K}_2 + \tfrac{1}{2} \lambda_2^2 \mathbf{K}_2^2$$

with $\lambda_i \ll 1$. To second order in the $\lambda_i$ we then obtain 

$$L_2^{-1} L_1^{-1} L_2 L_1 \simeq \mathbb{1} - \lambda_1 \lambda_2\, [\mathbf{K}_1, \mathbf{K}_2] = \mathbb{1} + \lambda_1 \lambda_2\, \mathbf{J}_3 . \tag{4.60}$$

Here is an example that illustrates this result. An elementary particle, say the 
electron, carries an intrinsic angular momentum, called spin. Let this particle have the 
momentum $\mathbf{p}_0 = 0$. The series of Lorentz transformations described above 
eventually brings the momentum back to the value $\mathbf{p}_0 = 0$. However, the spin is rotated 
a little about the 3-axis. This observation is the basis of the so-called Thomas 
precession, which is discussed in treatises on special relativity and which has a 
number of interesting applications. 
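
Relation (4.60) can also be tested with finite (though small) rapidities, using exact boost matrices instead of the truncated expansions. The sketch assumes the values $\lambda_1 = 10^{-3}$ and $\lambda_2 = 2 \cdot 10^{-3}$, and it assumes the sign convention in which $\mathbf{J}_3$ has the entries $-1$ at position (1, 2) and $+1$ at position (2, 1).

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def boost(axis, lam):
    """exp(lam K_axis): finite boost with rapidity lam along the given axis (1, 2 or 3)."""
    L = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    L[0][0] = L[axis][axis] = math.cosh(lam)
    L[0][axis] = L[axis][0] = math.sinh(lam)
    return L

lam1, lam2 = 1.0e-3, 2.0e-3
L1, L2 = boost(1, lam1), boost(2, lam2)
L1_inv, L2_inv = boost(1, -lam1), boost(2, -lam2)

# The boosts are applied in the order L1, L2 and undone in the "wrong" order.
net = matmul(L2_inv, matmul(L1_inv, matmul(L2, L1)))

# Expected up to third-order terms: identity + lam1*lam2*J_3.
rotation_entry_12 = net[1][2]   # close to -lam1*lam2
rotation_entry_21 = net[2][1]   # close to +lam1*lam2
```

The time-space entries of `net` are of third order in the rapidities, so that to second order only the small rotation about the 3-axis survives.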



4.6 Addition of Relativistic Velocities 



The special Galilei transformation (4.34), or the special Lorentz transformation 
(4.41), relates the inertial systems $K_0$ and K', the parameter $\mathbf{v}$ being the relative 
velocity of the two systems of reference. For example, one may think of K' as 
being fixed in a particle that moves with velocity $\mathbf{v}$ relative to an observer who is 
placed at the origin of $K_0$. We assume that the absolute value of this velocity is 
smaller than the velocity of light, c. Of course, the system of reference $K_0$ can be 
replaced with any other one, $K_1$, moving with constant velocity $\mathbf{w}$ relative to $K_0$ 
($|\mathbf{w}|$ being assumed smaller than c, too). What, then, is the special transformation 
(the boost) that describes the motion of K', the particle’s rest system, as observed 
from $K_1$? 

In the case of the Galilei transformation (4.34) the answer is obvious: K' moves 
relative to $K_1$ with the constant velocity $\mathbf{u} = \mathbf{v} + \mathbf{w}$. In particular, if $\mathbf{v}$ and $\mathbf{w}$ are 
parallel and if both $|\mathbf{v}|$ and $|\mathbf{w}|$ exceed c/2, then the magnitude of $\mathbf{u}$ exceeds c. 

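In the relativistic case the law of addition for velocities is different. Without 
restriction of generality let us take $\mathbf{v}$ along the 1-axis. Let $\lambda$ be the corresponding 
rapidity parameter, 

$$\tanh\lambda = \frac{|\mathbf{v}|}{c} \equiv \frac{v}{c} , \quad \text{or} \quad e^{\lambda} = \sqrt{\frac{1 + v/c}{1 - v/c}} ,$$

so that the transformation between $K_0$ and K' reads 

$$L(\mathbf{v} = v\hat{\mathbf{e}}_1) = \begin{pmatrix} \cosh\lambda & \sinh\lambda & 0 & 0 \\ \sinh\lambda & \cosh\lambda & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{4.61}$$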



A case of special interest is certainly the one where $\mathbf{w}$ is parallel to $\mathbf{v}$ and 
points in the same direction, i.e. the one where one boosts twice along the same 
direction. $L(\mathbf{w} = w\hat{\mathbf{e}}_1)$ has the form (4.61), with $\lambda$ being replaced by the parameter 
$\mu$, which fulfills 

$$\tanh\mu = \frac{w}{c} , \quad \text{or} \quad e^{\mu} = \sqrt{\frac{1 + w/c}{1 - w/c}} .$$

The product $L(w\hat{\mathbf{e}}_1)\, L(v\hat{\mathbf{e}}_1)$ is again a special Lorentz transformation along the 
1-direction. Making use of the addition theorems for hyperbolic functions one finds 

$$L(w\hat{\mathbf{e}}_1)\, L(v\hat{\mathbf{e}}_1) = \begin{pmatrix} \cosh(\lambda + \mu) & \sinh(\lambda + \mu) & 0 & 0 \\ \sinh(\lambda + \mu) & \cosh(\lambda + \mu) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \equiv L(u\hat{\mathbf{e}}_1) .$$

From this follows the relation 

$$e^{\lambda + \mu} = \sqrt{\frac{1 + u/c}{1 - u/c}} = \sqrt{\frac{(1 + v/c)(1 + w/c)}{(1 - v/c)(1 - w/c)}} ,$$

which, in turn, yields the rule for addition of (parallel) velocities, viz. 

$$\frac{u}{c} = \frac{v/c + w/c}{1 + vw/c^2} . \tag{4.62}$$



This formula has two interesting properties. 

(i) If both velocities v and w are small compared to the speed of light, then 

$$u = v + w + O(vw/c^2) . \tag{4.63}$$



Thus, (4.62) reduces to the nonrelativistic addition rule, as expected. The first rel- 
ativistic corrections are of order 1/c 2 . 

(ii) As long as v and w are both smaller than c, this holds also for u. If one 
of them is equal to c, the other one being still smaller than c, or, if both are equal 
to c, then u is equal to c. In no case does u ever exceed c. 
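
Both properties, as well as the additivity of rapidities that underlies (4.62), can be illustrated with a few sample values. The function name `add_parallel` is chosen for this sketch; velocities are given in units of c.

```python
import math

def add_parallel(v, w, c=1.0):
    """Relativistic sum (4.62) of two parallel velocities v and w."""
    return (v + w) / (1.0 + v * w / c ** 2)

u = add_parallel(0.6, 0.7)                        # stays below c = 1
rapidity_sum = math.atanh(0.6) + math.atanh(0.7)  # lambda + mu

half_plus_half = add_parallel(0.5, 0.5)           # 0.8 rather than 1.0
with_light = add_parallel(1.0, 0.3)               # exactly c
slow = add_parallel(1.0e-4, 2.0e-4)               # close to the Galilean sum 3e-4
```

While velocities combine according to (4.62), the rapidities simply add: `math.atanh(u)` equals `rapidity_sum` up to rounding.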

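When $\mathbf{v}$ and $\mathbf{w}$ do not point in the same direction matters become a little more 
complicated, but the conclusion remains unchanged. As an example, let us consider 
a boost along the 1-axis, followed by a boost along the 2-axis. This time we choose 
the form (4.37), or (4.41), noting that the parameters $\gamma$ and $\beta$ are related by 

$$\gamma_i = 1/\sqrt{1 - \beta_i^2} \quad \text{or} \quad \beta_i \gamma_i = \sqrt{\gamma_i^2 - 1} , \quad i = 1, 2 . \tag{4.64}$$

Multiplying the matrices $L(v_2 \hat{\mathbf{e}}_2)$ and $L(v_1 \hat{\mathbf{e}}_1)$ one finds that 

$$\Lambda \equiv L(v_2 \hat{\mathbf{e}}_2)\, L(v_1 \hat{\mathbf{e}}_1) = \begin{pmatrix} \gamma_1 \gamma_2 & \gamma_1 \gamma_2 \beta_1 & \gamma_2 \beta_2 & 0 \\ \gamma_1 \beta_1 & \gamma_1 & 0 & 0 \\ \gamma_1 \gamma_2 \beta_2 & \gamma_1 \gamma_2 \beta_1 \beta_2 & \gamma_2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{4.65}$$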



This transformation is neither a boost (because it is not symmetric) nor a pure 
rotation (because $\Lambda^0{}_0$ is not 1). Being the product of two boosts it is an element 
of $L_+^\uparrow$. Therefore, it must be a product of the two kinds of transformations, one 
boost and one rotation. The decomposition theorem (Sect. 4.5.1) in the form of 
(4.44), when applied to $\Lambda$, gives 

$$\Lambda = L(\mathbf{u})\, \mathcal{R}(\boldsymbol{\varphi})$$

with $u^i/c = \Lambda^i{}_0/\Lambda^0{}_0 = (\beta_1/\gamma_2, \beta_2, 0)$, while the equations (4.46) for the rotation 
give, making use of (4.64), 

$$R^{11} = R^{22} = (\gamma_1 + \gamma_2)/(1 + \gamma_1 \gamma_2) , \qquad R^{33} = 1 ,$$

$$R^{12} = -R^{21} = -\sqrt{(\gamma_1^2 - 1)(\gamma_2^2 - 1)}/(1 + \gamma_1 \gamma_2) , \qquad R^{13} = R^{23} = 0 = R^{31} = R^{32} .$$

Thus, the rotation is about the 3-axis, $\boldsymbol{\varphi} = \varphi\, \hat{\mathbf{e}}_3$, the angle being 

$$\varphi = -\arctan \frac{\sqrt{(\gamma_1^2 - 1)(\gamma_2^2 - 1)}}{\gamma_1 + \gamma_2} = -\arctan \frac{\beta_1 \beta_2}{\sqrt{1 - \beta_1^2} + \sqrt{1 - \beta_2^2}} . \tag{4.66a}$$

For the velocity $\mathbf{u}$ one finds 

$$\left( \frac{\mathbf{u}}{c} \right)^2 = \beta_1^2 + \beta_2^2 - \beta_1^2 \beta_2^2 = \frac{\gamma_1^2 \gamma_2^2 - 1}{\gamma_1^2 \gamma_2^2} , \tag{4.66b}$$

so that, indeed, $|\mathbf{u}| < c$. We note that if $\mathbf{v}$ and $\mathbf{w}$ have arbitrary relative directions 
the parameter $\gamma$ pertaining to $\mathbf{u}$ is equal to the product of $\gamma_1$, $\gamma_2$, and $(1 + \mathbf{v} \cdot \mathbf{w}/c^2)$. 
Whenever both $\gamma_i$ are larger than 1, then $\gamma$ is also larger than or equal to 1, and 
hence the parameter $\beta$ pertaining to $\mathbf{u}$ is smaller than 1. In other words, $|\mathbf{u}|$ never 
exceeds c. 
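
The results (4.65) and (4.66a,b) can be reproduced numerically for two perpendicular boosts. The sketch assumes c = 1, the sample values $\beta_1 = 0.6$ and $\beta_2 = 0.8$, and the sign convention in which a rotation by the angle $\varphi$ about the 3-axis has $R^{11} = \cos\varphi$ and $R^{12} = \sin\varphi$; the angle is read off accordingly.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def boost(axis, beta):
    """L(v e_axis) in the form (4.61), written with gamma and beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    L = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    L[0][0] = L[axis][axis] = gamma
    L[0][axis] = L[axis][0] = gamma * beta
    return L

b1, b2 = 0.6, 0.8
g1 = 1.0 / math.sqrt(1.0 - b1 * b1)
g2 = 1.0 / math.sqrt(1.0 - b2 * b2)

# Boost along the 1-axis first, then along the 2-axis, eq. (4.65).
Lam = matmul(boost(2, b2), boost(1, b1))

# Velocity of the equivalent single boost, eq. (4.45).
u = [Lam[i][0] / Lam[0][0] for i in (1, 2, 3)]
u2 = sum(x * x for x in u)

# Rotation matrix elements from (4.46) and the rotation angle.
R11 = Lam[1][1] - Lam[1][0] * Lam[0][1] / (1.0 + Lam[0][0])
R12 = Lam[1][2] - Lam[1][0] * Lam[0][2] / (1.0 + Lam[0][0])
phi = math.atan2(R12, R11)

phi_formula = -math.atan(math.sqrt((g1 ** 2 - 1.0) * (g2 ** 2 - 1.0)) / (g1 + g2))
```

With these values one finds $\mathbf{u}/c = (0.36, 0.8, 0)$ and $(\mathbf{u}/c)^2 = 0.7696$, and the matrix `Lam` is visibly not symmetric.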

These somewhat complicated relationships simplify considerably when all 
velocities are small compared to the speed of light. In Sect. 4.4.2 we already checked 
that the nonrelativistic limit of a special Lorentz transformation $L(\mathbf{v})$ yields 
precisely the corresponding special Galilei transformation. If in (4.65) both $v_1$ and $v_2$ 
are small compared to c, we obtain 

$$\mathbf{u} \simeq v_1 \hat{\mathbf{e}}_1 + v_2 \hat{\mathbf{e}}_2 , \qquad \varphi \simeq -\arctan \frac{v_1 v_2}{2c^2} \simeq 0 .$$

The two velocities add like vectors; the rotation about the 3-axis is the identity. The 
induced rotation in (4.66a) is a purely relativistic phenomenon. Locally, i.e. when 
expressed infinitesimally, it is due to the commutator (4.59d) that we discussed in 
Sect. 4.5.2 (iv). 



4.7 Galilean and Lorentzian Space-Time Manifolds 

While translations (in space and in time) and rotations (in space) are the same 
within the Galilei and Lorentz groups, the special transformations are different, in 
an essential way, in the two cases. As a consequence, the space-time manifolds 
equipped with the Galilei group as the invariance group, or alternatively the Lorentz 
group, inherit a very different structure. This is what we wish to show in this 
section. 

We start with the example of a special (or boost) transformation with velocity 
$\mathbf{w} = \beta c\, \hat{\mathbf{e}}_1$ along the 1-axis, understood to be a passive transformation. In the case 
of the Galilei group it reads, setting $x^0 = ct$, 

$$x'^0 = x^0 , \qquad x'^2 = x^2 ,$$
$$x'^1 = x^1 - \beta x^0 , \qquad x'^3 = x^3 . \tag{4.67}$$

(Of course, (4.67) is independent of the speed of light; c is introduced here in view 
of the comparison with the relativistic case.) In the case of the Lorentz group it 
reads 

$$x'^0 = \gamma\,[x^0 - \beta x^1] , \qquad x'^2 = x^2 ,$$
$$x'^1 = \gamma\,[-\beta x^0 + x^1] , \qquad x'^3 = x^3 . \tag{4.68}$$

The coordinates $x^\mu$ refer to the inertial system K; the coordinates $x'^\mu$ refer to K', 
which moves, relative to K, with the velocity $\mathbf{w} = \beta c\, \hat{\mathbf{e}}_1$. Suppose we are given 
three mass points A, B, C, to which no forces are applied and whose coordinates 
at time t = 0 are $\mathbf{x}^{(A)} = (0, 0, 0)$, $\mathbf{x}^{(B)} = \mathbf{x}^{(C)} = (\Delta, 0, 0)$, with respect to the 
system K. A is assumed to be at rest; B moves with the velocity $\mathbf{v} = 0.1\, c\, \hat{\mathbf{e}}_1$; 
C moves with the velocity $\mathbf{w} = \beta c\, \hat{\mathbf{e}}_1$ in the same direction as B. We choose 
$\beta = 1/\sqrt{3} \simeq 0.58$. All three of them move uniformly along straight lines in the 
$(x^1, t)$-plane. After time $t = \Delta/c$, for example, they have reached the positions 
$A_1$, $B_1$, $C_1$, respectively, indicated in Figs. 4.3a and b. If one follows the same 
motions by placing an observer in the system of reference K', then in a Lorentz 
invariant world the picture will be very different from the one in a Galilei invariant 
world. 

(i) According to the nonrelativistic equations (4.67), the positions of the three 
mass points with respect to K' and at t' = 0 coincide with those with respect to K. 
After the time $t = \Delta/c$ they have reached the positions $A'_1$, $B'_1$, $C'_1$, respectively, 
shown in Fig. 4.3a. The figure shows very clearly that time plays a special role, 
compared to space. Events that are simultaneous with respect to K are observed 
by an observer in K' at the same times, too. As was explained in Sect. 1.14 (ii) 
it is not possible to compare spatial positions of points at different times without 
knowing the relation (4.67) between the two systems (e.g. comparing $A_0$ with 
$A_1$, $A'_0 = A_0$ with $A'_1$). However, the comparison of positions taken on at equal 
times is independent of the system of reference one has chosen, and therefore it 
is physically meaningful. To give an example, if an observer in K and another 
observer in K' measure the spatial positions of A and C at time t = t' = 0, as 
well as at any other time t = t', they will find that A and C move uniformly along 
straight lines and that the difference of their velocities is $\mathbf{w} = \beta c\, \hat{\mathbf{e}}_1$. 

(ii) If the two systems of reference are related by the Lorentz transformation
(4.68), instead of the Galilei transformation (4.67), the observer in K' sees the
orbits $A'_0A'_1$, $B'_0B'_1$, $C'_0C'_1$ as shown in Fig. 4.3b. This figure leads to two important
observations. Firstly, simultaneity of events is now dependent on the system of
reference. The events $A_0$ and $B_0 = C_0$, which are simultaneous with respect to K,
lie on the straight line $x'^0 = -\beta x'^1$ when observed from K', and hence occur at
different times. (Similarly, the events $A_1$, $B_1$ and $C_1$ are simultaneous with respect
to K. In K' they fall onto the straight line $x'^0 = -\beta x'^1 + \gamma(1 - \beta^2)\Delta$.) Secondly,
Fig. 4.3b shows a new symmetry between $x^0$ and $x^1$, which is not present in the
corresponding nonrelativistic figure (4.3a). The images of the lines t = 0 and
$x^1 = 0$, in K', are symmetric with respect to the bisector of the first quadrant. (As
we assigned the coordinates $(\Delta, 0)$ to $B_0$, $(0, \Delta)$ to $A_1$, their images $B'_0$ and $A'_1$,
respectively, have symmetric positions with respect to the same straight line, too.)

Fig. 4.3. Three mass points moving uniformly, but with different velocities, along
straight lines. They are observed from two different inertial systems K and K'.
(a) K and K' are related by a special Galilei transformation, (b) K and K' are
related by a special Lorentz transformation
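
The two observations can be checked numerically. The following Python sketch is an illustration only and not part of the original argument: it assumes units with c = 1 and Δ = 1, and the helper names (`boost`, `line_offset`, `events_t0`, `events_t1`) are chosen here for convenience. It maps the events of Fig. 4.3 from K to K' by means of (4.68) and measures how far each image lies from the straight lines quoted in the text.

```python
import math

BETA = 1.0 / math.sqrt(3.0)              # velocity of K' relative to K, in units of c
GAMMA = 1.0 / math.sqrt(1.0 - BETA**2)
DELTA = 1.0                              # assumed unit of length (the distance Δ)
V_B = 0.1                                # speed of B in units of c


def boost(event, beta=BETA):
    """Apply (4.68) to an event (x^0, x^1); x^2 and x^3 stay unchanged."""
    x0, x1 = event
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (x0 - beta * x1), gamma * (-beta * x0 + x1))


# Events with respect to K, written as (x^0, x^1) = (ct, x)
events_t0 = {"A0": (0.0, 0.0), "B0": (0.0, DELTA)}          # B0 = C0
events_t1 = {
    "A1": (DELTA, 0.0),
    "B1": (DELTA, DELTA + V_B * DELTA),
    "C1": (DELTA, DELTA + BETA * DELTA),
}


def line_offset(event, intercept):
    """Deviation of the boosted event from the line x'^0 = -beta x'^1 + intercept."""
    x0p, x1p = boost(event)
    return x0p + BETA * x1p - intercept


offsets_t0 = [line_offset(e, 0.0) for e in events_t0.values()]
offsets_t1 = [
    line_offset(e, GAMMA * (1.0 - BETA**2) * DELTA) for e in events_t1.values()
]

if __name__ == "__main__":
    for name, event in {**events_t0, **events_t1}.items():
        print(name, "->", boost(event))
    print("offsets:", offsets_t0, offsets_t1)
```

All offsets vanish up to rounding, while the images of $A_0$ and $B_0$ carry different time coordinates $x'^0$. The images of $B_0$ and $A_1$ are obtained from each other by exchanging $x'^0$ and $x'^1$, which is the symmetry about the bisector mentioned above.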

More generally, what can we say about the structure of Galilean space-time
and of Minkowskian space-time? Both are smooth manifolds with the topology
of $\mathbb{R}^4$. The choice of a coordinate system is usually made with regard to the local
physical processes one wishes to describe and may be understood as the choice
of a “chart” taken from an “atlas” that describes the manifold. (These notions are
given precise definitions and interpretations in Chap. 5.)

(i) Galilei invariant space-time. In a world where physics is invariant under Galilei
transformations, time has an absolute nature: the statement that two events take
place at the same time is independent of their spatial distance and of the coordinate
system one has chosen. Call $P_G$ the (four-dimensional) Galilean space-time;
$M = \mathbb{R}_t$, the (one-dimensional) time manifold. Suppose first that we choose an
arbitrary coordinate system K with respect to which the orbits of physical particles
are described by world lines $(t, \mathbf{x}(t))$. Consider the projection

$$\pi : P_G \to M : (t, \mathbf{x}) \mapsto t\,, \tag{4.69}$$

which assigns its time coordinate t to every point of the world line $(t, \mathbf{x}) \in P_G$.
Keeping t fixed, the projection $\pi$ in (4.69) collects all $\mathbf{x}$ that are simultaneous. If
$\mathbf{x}'$ and t' are the images of these $\mathbf{x}$ and the fixed t, respectively, under a general
Galilei transformation

$$t' = t + s\,, \qquad \mathbf{x}' = \mathbf{R}\mathbf{x} + \mathbf{w}t + \mathbf{a}\,, \tag{4.70}$$

then the projection defined in (4.69) again collects all simultaneous events,

$$\pi : (t', \mathbf{x}') \mapsto t'\,.$$

Thus, the projection has a well-defined meaning, independent of the specific
coordinate system one chooses. Consider now an interval I of $\mathbb{R}_t$ that contains the
time t. The preimage of I with respect to $\pi$ has the structure (time interval) ×
(three-dimensional affine space),

$$\pi^{-1} : I \to \pi^{-1}(I) \subset P_G\,, \quad \text{isomorphic to } I \times E^3\,. \tag{4.71}$$

In the terminology of differential geometry this statement and the properties that
$\pi$ has mean that $P_G$ is an affine fibre bundle over the base manifold $M = \mathbb{R}_t$ with
typical fibre $E^3$. (We do not give the precise definitions here.)

The world line in (4.69) refers to a specific (though arbitrary) observer’s system
K, the observer taking his own position as a point of reference. This corresponds
to the statement that one always compares two (or more) physical events in $P_G$.
The projection (4.69) asks for events, say A and B, which are simultaneous, i.e.
for which $t_A = t_B$. This suggests defining the projection in a truly coordinate-free
manner as follows. Let $x_A = (t_A, \mathbf{x}_A)$ and $x_B = (t_B, \mathbf{x}_B)$ be points of $P_G$. The
projection declares all those points to be equivalent, $x_A \sim x_B$, for which $t_A = t_B$.

What else can we say about the structure of $P_G$? If in (4.70) we exclude the
special transformations, by taking $\mathbf{w} = 0$, then there would exist a canonical
projection onto three-dimensional space that would be the same for any choice of
the coordinate system. In this case $P_G$ would have the global product structure
$\mathbb{R}_t \times E^3$. A fibre bundle that has this global product structure is said to be trivial.
However, if we admit the special transformations ($\mathbf{w} \neq 0$) in (4.70), then our
example discussed earlier and the more general case illustrated by Fig. 4.4 show
that the realization of the projection is not independent of the system of reference
one has chosen. Although the bundle

$$P_G\left(\pi : P_G \to M = \mathbb{R}_t\,,\ \text{Fibre } F = E^3\right) \tag{4.72}$$

has the local structure $\mathbb{R}_t \times E^3$, it is not trivial in the sense defined above.

Fig. 4.4. In Galilean space-time $P_G$, time has an absolute character. However, the
projection onto the spatial part of $P_G$ depends on the system of reference one chooses

(ii) Lorentz invariant space-time. The example given by (4.68) and Fig. 4.3b shows 
clearly that the space-time endowed with the Lorentz transformations does not 
have the bundle structure of Galilean space-time. Neither the projection onto the 
time axis nor that onto three-dimensional space can be defined in a canonical way, 
i.e. independently of a coordinate system. On the contrary, space and time now 
appear as truly equivalent, Lorentz transformations mixing space and time in a 
symmetric way. 

Not only spatial distances but also time differences now depend on the inertial 
system one chooses. (There is a correlation between spatial and time distances,
though, because $(x^0_{(1)} - x^0_{(2)})^2 - (\mathbf{x}_{(1)} - \mathbf{x}_{(2)})^2$ must be invariant.) As a consequence,
moving scales look shorter, while moving clocks tick more slowly. These are new 
and important phenomena to which we now turn. 
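
The contrast between cases (i) and (ii) can be made concrete with a few numbers. The sketch below is an illustration under stated assumptions: one space dimension, c = 1, the rotation in (4.70) set to the identity, and function names (`galilei`, `lorentz`, `interval_sq`) invented for this example. It transforms three sample events with a Galilei transformation and with the special Lorentz transformation (4.68).

```python
import math


def galilei(event, w, a=0.0, s=0.0):
    """(4.70) in one space dimension with R = 1: t' = t + s, x' = x + w t + a."""
    t, x = event
    return (t + s, x + w * t + a)


def lorentz(event, beta):
    """(4.68) restricted to the (x^0, x^1) plane, in units with c = 1."""
    x0, x1 = event
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (x0 - beta * x1), gamma * (-beta * x0 + x1))


def interval_sq(e1, e2):
    """Invariant (x^0_(1) - x^0_(2))^2 - (x_(1) - x_(2))^2 of two events."""
    return (e2[0] - e1[0]) ** 2 - (e2[1] - e1[1]) ** 2


e1, e2 = (1.0, 0.0), (1.0, 3.0)   # simultaneous in K, three length units apart
e3 = (4.0, 0.0)                   # same place as e1 in K, but at a later time

g1, g2, g3 = (galilei(e, w=0.5, a=2.0, s=0.7) for e in (e1, e2, e3))
l1, l2, l3 = (lorentz(e, beta=0.5) for e in (e1, e2, e3))

if __name__ == "__main__":
    print("Galilei:", g1, g2, g3)
    print("Lorentz:", l1, l2, l3)
```

Under the Galilei transformation the two simultaneous events stay simultaneous and keep their distance, whereas "the same place at different times" (events `e1` and `e3`) is not preserved. Under the Lorentz transformation both the time difference and the spatial distance change, and only the combination computed by `interval_sq` stays the same.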



4.8 Orbital Curves and Proper Time 

The example illustrated by Fig. 4.3b reveals a surprising, and at first somewhat 
strange, property: a given process of physical motion takes different times, from its 
beginning to its end, if it is observed from different systems of reference. In order 
to get rid of this dependence on the system of reference, it is helpful to think of the 
moving objects A, B, and C as being equipped with their own clocks and, if they 
are extended objects, with their own measuring scales. This is useful because then 
we can compare their intrinsic data with the data in other systems of reference. In 
particular, if the motion is uniform and along a straight line, the comoving systems 
of reference are inertial and the comparison becomes particularly simple. 

In the case of arbitrary, accelerated motion, the best approach is to describe
the orbit curve in a geometrical, invariant manner, by means of a Lorentz-invariant
orbital parameter. In other words one writes the world line of a mass point in the
form $x(\tau)$, where $\tau$ is the arc length of this world line; $\tau$ is an orbital parameter
that is independent of any system of reference. Of course, instead of the dimension
length, we could give it the dimension time, by multiplication with 1/c. The
function $x(\tau)$ describes the spatial and temporal evolution of the motion, in a
geometrically invariant way. If $\tau$ is given the dimension of time in this way,
one can understand $\tau$ to be the time
shown by a clock that is taken along in the motion. For this reason, $\tau$ is called
the proper time.

Note, however, that the world line $x(\tau)$ cannot be completely arbitrary. The
particle can only move at velocities that do not exceed the speed of light. This
is equivalent to the requirement that there must exist a momentary rest system
at any point of the orbit. If we choose an arbitrary inertial system of reference,
$x(\tau)$ has the representation $x(\tau) = (x^0(\tau), \mathbf{x}(\tau))$. Given $x(\tau)$, the velocity vector
$\dot{x} = (\dot{x}^0, \dot{\mathbf{x}})^T$ is defined by

$$\dot{x}^\mu \overset{\text{def}}{=} \frac{d}{d\tau}\, x^\mu(\tau)\,. \tag{4.73}$$

In order to satisfy the requirement stated above, this vector must always be timelike
(or lightlike), i.e. $(\dot{x}^0)^2 \ge \dot{\mathbf{x}}^2$. If this is fulfilled, then the following statement
also holds true: if $\dot{x}^0 = dx^0/d\tau > 0$ holds in one point of the orbit, then this holds
everywhere along the whole orbit. (Figure 4.5 shows an example of a physically
possible orbital curve in space-time.) Finally, one can choose the orbital parameter
$\tau$ in such a way that the (invariant) norm of the vector (4.73) always has
the value c:

$$\dot{x}^2 \equiv \dot{x}^\mu g_{\mu\nu}\, \dot{x}^\nu = c^2\,. \tag{4.74}$$

For a given value of the parameter $\tau = \tau_0$, $\dot{x}$ at the world point $(\tau_0, x(\tau_0))$ can be
brought to the form $\dot{x} = (c, 0, 0, 0)$ by means of a Lorentz transformation. Thus,
this transformation leads to the momentary rest system of the particle, and we have

$$\frac{dx^0}{d\tau} = c\,, \quad \text{i.e.} \quad d\tau = \frac{1}{c}\, dx^0 = dt \quad (\text{at } \tau = \tau_0)\,. \tag{4.75}$$

Note that $\dot{x}$ is precisely the vector $\omega$ of (4.42) and that the transformation to the
rest system is precisely the one given in (4.43).

Fig. 4.5. Example of a physically allowed world line. At any point of the orbit the
velocity vector is timelike or lightlike (i.e. in the diagram its slope is greater than
or equal to 45°)

The result (4.75) can be interpreted in the following way. If the particle carries a
clock along its orbit, this clock measures the proper time $\tau$. Read as a geometrical
variable, $\tau$ is proportional to the length of arc, $s \overset{\text{def}}{=} c\tau$. This is so because the
invariant (squared) line element $ds^2$ is given by

$$ds^2 = c^2\, d\tau^2 = dx^\mu g_{\mu\nu}\, dx^\nu = c^2 (dt)^2 - (d\mathbf{x})^2\,. \tag{4.76}$$

This expression emphasizes again the role of $g_{\mu\nu}$ as the metric tensor.
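
As a hedged numerical illustration of (4.76), the proper time along a world line can be approximated by summing $d\tau = \sqrt{(dt)^2 - (d\mathbf{x})^2/c^2}$ over short straight segments. The sketch below assumes units with c = 1; the functions `proper_time`, `straight_line` and `circle` are hypothetical helpers written for this example. It compares a uniform straight motion with a uniform circular (hence accelerated) motion at the same speed 0.6 c.

```python
import math


def proper_time(worldline, c=1.0):
    """Approximate proper time along a world line given as events (t, x, y, z)."""
    tau = 0.0
    for start, end in zip(worldline, worldline[1:]):
        dt = end[0] - start[0]
        dx2 = sum((b - a) ** 2 for a, b in zip(start[1:], end[1:]))
        dtau2 = dt**2 - dx2 / c**2          # (4.76) divided by c^2
        if dtau2 < 0.0:
            raise ValueError("spacelike segment: the speed would exceed c")
        tau += math.sqrt(dtau2)
    return tau


def straight_line(v, t_end, steps=100):
    """Uniform motion along the 1-axis with speed v."""
    return [(t_end * k / steps, v * t_end * k / steps, 0.0, 0.0)
            for k in range(steps + 1)]


def circle(v, radius, t_end, steps=20000):
    """Uniform circular motion with speed v in the (1,2)-plane."""
    omega = v / radius
    points = []
    for k in range(steps + 1):
        t = t_end * k / steps
        points.append((t, radius * math.cos(omega * t),
                       radius * math.sin(omega * t), 0.0))
    return points


tau_straight = proper_time(straight_line(0.6, 10.0))
tau_circle = proper_time(circle(0.6, 1.0, 10.0))

if __name__ == "__main__":
    print(tau_straight, tau_circle)
```

In both cases the comoving clock shows approximately 8 time units for 10 units of coordinate time, because the integrand depends only on the momentary speed. The circular case is a polygonal approximation, so its value agrees only up to a small discretization error.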



4.9 Relativistic Dynamics 

4.9.1 Newton’s Equation 

Let K be an inertial system of reference and let a particle move with (momentary) 
velocity v relative to K. Further, let Ko be the rest system of the particle, the 
axes of Ko being chosen parallel to those of K. The relation between the two 
systems is then given by the special Lorentz transformation (4.41), with velocity 
$\mathbf{v}$, as indicated in the following:

$$K_0 \;\underset{L(\mathbf{v})}{\overset{L(-\mathbf{v})}{\rightleftarrows}}\; K\,. \tag{4.77}$$

In trying to generalize Newton’s second law (1.8) to relativistic dynamics, we must 
take care of two conditions. 

(i) The postulated relation between the generalized acceleration $d^2x(\tau)/d\tau^2$ and
the relativistic analog of the applied force must be form invariant with respect 
to every proper, orthochronous Lorentz transformation. An equation of mo- 
tion that is form invariant (i.e., loosely speaking, both sides of the equation 
transform in the same way), is also said to be covariant. Only if it obeys this 
condition will the equation of motion describe the same physics, independent 
of the reference system in which it is formulated. 

(ii) In the rest system of the particle, as well as in cases where the velocities are 
small compared to the speed of light, $|\mathbf{v}| \ll c$, the equation of motion becomes
Newton’s equation (1.8). 

Let m be the mass of the particle as one knows it from nonrelativistic mechanics.
The observation that this quantity refers to the rest system of the particle suggests
that we should regard it as an intrinsic property of the particle that has nothing to
do with its momentary state of motion. For this reason, this quantity is said to be
the rest mass of the particle. In the case of elementary particles the rest mass is
one of the fundamental properties characteristic of the particle. For example, the
electron has the rest mass

$$m_e = (9.109\,3897 \pm 0.000\,0027) \times 10^{-31}\ \text{kg}\,,$$

while the muon, which otherwise has all the properties of the electron, is
characterized by its rest mass being heavier, viz.

$$m_\mu \simeq 206.77\, m_e\,.$$



Very much like proper time $\tau$, the rest mass m is a Lorentz scalar. Therefore, with
the following form for the generalized equation of motion:

$$m\, \frac{d^2}{d\tau^2}\, x^\mu(\tau) = f^\mu\,, \tag{4.78}$$

the left-hand side is a four-vector under Lorentz transformations. Condition (i)
states that $f^\mu$ must be a four-vector as well. If this is so, we can write down
the equation of motion (4.78) in the rest system $K_0$, where we can make use of
the second condition (ii). With respect to $K_0$ and by (4.75), $d\tau = dt$. Hence the
left-hand side of (4.78) reads

$$m\, \frac{d^2}{d\tau^2}\, x^\mu(\tau)\Big|_{K_0} = m\,(0, \ddot{\mathbf{x}})\,.$$

Condition (ii) imposes the requirement

$$f^\mu\big|_{K_0} = (0, \mathbf{K})\,,$$

where $\mathbf{K}$ is the Newtonian force. We calculate $f^\mu$ with respect to the inertial
system K, as indicated in (4.77):

$$f^\mu\big|_{K} = \sum_{\nu=0}^{3} L^\mu{}_\nu(\mathbf{v})\, f^\nu\big|_{K_0}\,. \tag{4.79}$$



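Writing this out in space and time components, we have

$$\mathbf{f} = \mathbf{K} + \frac{\gamma^2}{1 + \gamma}\, \frac{1}{c^2}\, (\mathbf{v} \cdot \mathbf{K})\, \mathbf{v}\,,$$
$$f^0 = \gamma\, \frac{1}{c}\, (\mathbf{v} \cdot \mathbf{K}) = \frac{1}{c}\, (\mathbf{v} \cdot \mathbf{f})\,, \tag{4.80}$$

where we have used the relationship $\beta^2 = (\gamma^2 - 1)/\gamma^2$. Thus, the covariant force
$f^\mu$ is nothing but the Newtonian force $(0, \mathbf{K})$, boosted from the rest system to K.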
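
The step from (4.79) to (4.80) can be reproduced numerically. The sketch below is an illustration only. It assumes that the special Lorentz transformation (4.41) has the standard boost form with entries $L^0{}_0 = \gamma$, $L^0{}_i = L^i{}_0 = \gamma v^i/c$ and $L^i{}_j = \delta_{ij} + \frac{\gamma^2}{1+\gamma} v^i v^j/c^2$ (equation (4.41) itself is not reproduced in this section), and it uses sample values for $\mathbf{v}$ and $\mathbf{K}$ in units with c = 1. All function names are hypothetical.

```python
import math


def boost_matrix(v, c=1.0):
    """Standard boost L(v) as a 4x4 nested list, together with gamma."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / c**2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = gamma
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = gamma * v[i] / c
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            L[i + 1][j + 1] = delta + gamma**2 / (1.0 + gamma) * v[i] * v[j] / c**2
    return L, gamma


def apply(L, x):
    """Matrix times four-vector, as in (4.79)."""
    return [sum(L[m][n] * x[n] for n in range(4)) for m in range(4)]


def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))


def minkowski_sq(x):
    return x[0] ** 2 - dot3(x[1:], x[1:])


v = (0.3, -0.4, 0.5)          # sample velocity, |v| < c = 1
K = (1.0, 2.0, -0.5)          # sample Newtonian force in the rest system

L, gamma = boost_matrix(v)
f = apply(L, [0.0, *K])       # (4.79): boost (0, K) from K_0 to K

# Right-hand sides of (4.80) for comparison
f_formula = [gamma * dot3(v, K)] + [
    K[i] + gamma**2 / (1.0 + gamma) * dot3(v, K) * v[i] for i in range(3)
]

if __name__ == "__main__":
    print(f)
    print(f_formula)
```

Besides reproducing (4.80), the example confirms the relation $f^0 = (\mathbf{v} \cdot \mathbf{f})/c$ and the invariance of the squared norm, $f^2 = -\mathbf{K}^2$.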
4.9.2 The Energy-Momentum Vector

The equation of motion (4.78) obtained above suggests defining the following
relativistic analog of the momentum $\mathbf{p}$:

$$p^\mu \overset{\text{def}}{=} m\, \frac{d}{d\tau}\, x^\mu(\tau)\,. \tag{4.81}$$

When evaluated in the rest system this takes the form

$$p^\mu\big|_{K_0} = (mc, \mathbf{0})\,.$$

If it is boosted to the system K, as in (4.79), it becomes

$$p^\mu\big|_{K} = (\gamma mc, \gamma m\mathbf{v})\,. \tag{4.82}$$

The same result can be obtained in an alternative way. From (4.76) we see that
$d\tau$ along an orbit is given by

$$d\tau = \sqrt{(dt)^2 - (d\mathbf{x})^2/c^2} = \sqrt{1 - \beta^2}\; dt = dt/\gamma\,.$$

Equation (4.81), on the other hand, when evaluated in K, gives

$$p^0 = m\gamma\, \frac{d}{dt}(ct) = mc\gamma\,, \tag{4.82a}$$
$$\mathbf{p} = m\gamma\, \frac{d}{dt}\, \mathbf{x} = m\gamma\, \mathbf{v}\,. \tag{4.82b}$$

The Lorentz scalar parameter m is the rest mass of the particle. It takes over the
role of the well-known mass parameter of nonrelativistic mechanics whenever the
particle is at rest or moves at small velocities. Note that the nonrelativistic relation
$\mathbf{p} = m\mathbf{v}$ is replaced by (4.82b), i.e. the mass is replaced by the product of the rest
mass m and $\gamma$. For this reason the product

$$m(v) \overset{\text{def}}{=} m\gamma = \frac{m}{\sqrt{1 - v^2/c^2}}$$

is sometimes interpreted as the moving, velocity-dependent mass. It is equal to the
rest mass for v = 0 but tends to (plus) infinity when v approaches the speed of
light, c. As stated in Sect. 1.4 it is advisable to avoid this interpretation.

The time component of the four-vector $p^\mu$, when multiplied with c, has the
dimension of energy. Therefore, we write

$$p^\mu = \left(\frac{1}{c}\, E, \mathbf{p}\right) \quad \text{with} \quad E = \gamma mc^2\,, \quad \mathbf{p} = \gamma m\mathbf{v}\,. \tag{4.83}$$

This four-vector is said to be the energy-momentum vector. Clearly, its squared
norm is invariant under Lorentz transformations. It is found to have the value

$$p^2 \equiv (p, p) = (p^0)^2 - \mathbf{p}^2 = \frac{1}{c^2}\, E^2 - \mathbf{p}^2 = m^2c^2\,.$$

This last equation yields the important relativistic relationship

$$E = \sqrt{\mathbf{p}^2c^2 + (mc^2)^2} \tag{4.84}$$

between the energy E and the momentum $\mathbf{p}$ of a free particle. This is the relativistic
generalization of the energy-momentum relation we anticipated in (4.7). If $\mathbf{p} = 0$,
then $E = mc^2$. The quantity $mc^2$ is called the rest energy of the particle with mass
m. Thus, E always contains this contribution, even when the momentum vanishes.
Consequently, the kinetic energy must be defined as follows:

$$T \overset{\text{def}}{=} E - mc^2\,. \tag{4.85}$$



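The first test, of course, is to verify that the well-known relation $T = \mathbf{p}^2/2m$ is
obtained from (4.85) for small velocities. Indeed, for $\beta \ll 1$,

$$T = mc^2 \left[ \sqrt{1 + \frac{\mathbf{p}^2}{m^2c^2}} - 1 \right] \simeq \frac{\mathbf{p}^2}{2m} \left( 1 - \frac{\mathbf{p}^2}{4m^2c^2} \right) = T_{\text{nonrel}} - \frac{(\mathbf{p}^2)^2}{8m^3c^2}\,.$$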
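
A short numerical check of (4.83) to (4.85) and of the low-velocity expansion may be helpful. The following sketch is illustrative only: it works in units with c = 1, restricts the motion to one axis, and uses function names invented for this example.

```python
import math


def energy_momentum(m, v, c=1.0):
    """Return (E, p) of (4.83) for rest mass m and speed v along one axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m * c**2, gamma * m * v


def kinetic_energy(m, p, c=1.0):
    """T = E - m c^2, cf. (4.85), with E taken from (4.84)."""
    return math.sqrt((p * c) ** 2 + (m * c**2) ** 2) - m * c**2


def kinetic_energy_expanded(m, p, c=1.0):
    """Nonrelativistic kinetic energy plus the first relativistic correction."""
    return p**2 / (2.0 * m) - p**4 / (8.0 * m**3 * c**2)


# Mass-shell check for a fast particle: E^2/c^2 - p^2 = m^2 c^2
E_fast, p_fast = energy_momentum(m=2.0, v=0.8)
mass_shell = E_fast**2 - p_fast**2

# Expansion check for a slow particle
m_slow, p_slow = 1.0, 0.05
T_exact = kinetic_energy(m_slow, p_slow)
T_series = kinetic_energy_expanded(m_slow, p_slow)
T_newton = p_slow**2 / (2.0 * m_slow)

if __name__ == "__main__":
    print(mass_shell, T_exact, T_series, T_newton)
```

For the slow particle the corrected expression lies much closer to the exact kinetic energy than $\mathbf{p}^2/2m$ alone, as the expansion above suggests.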



Clearly, only a complete dynamical theory can answer the questions raised in 
Sect. 4.1. Nevertheless, the relativistic equation opens up possibilities that were not 
accessible in nonrelativistic mechanics, and that we wish at least to sketch. Any 
theory of interactions between particles that is invariant under Lorentz transforma- 
tions contains the equation (4.84) for free particles. The following consequences 
can be deduced from this relation between energy and momentum. 

(i) Even a particle at rest has energy, $E(\mathbf{v} = 0) = mc^2$, proportional to its
mass. This is the key to understanding why a massive elementary particle can decay
into other particles such that its rest energy is converted, partially or entirely, into
kinetic energy of the decay products. For example, in the spontaneous decay of a
positively charged pion into a positively charged muon and a neutrino,

$$\pi^+\,(m_\pi = 273.13\, m_e) \;\to\; \mu^+\,(m_\mu = 206.77\, m_e) + \nu\,(m_\nu \approx 0)\,,$$

about one fourth of its rest mass, namely $((m_\pi - m_\mu)/m_\pi)\, m_\pi c^2$, is found in the
form of kinetic energy of the $\mu^+$ and the $\nu$. This is calculated as follows. Let
$(E_q/c, \mathbf{q})$, $(E_p/c, \mathbf{p})$, and $(E_k/c, \mathbf{k})$ denote the four-momenta of the pion, the
muon, and the neutrino, respectively. The pion being at rest before the decay (cf.
Fig. 4.6) we have

$$q^\mu = (m_\pi c, \mathbf{0})\,, \qquad E_p = \sqrt{(m_\mu c^2)^2 + \mathbf{p}^2c^2}\,, \qquad E_k = |\mathbf{k}|\, c\,.$$






Fig. 4.6. A positively charged pion at rest decays into a (positively charged) muon
and an (electrically neutral) neutrino

By conservation of energy and momentum

$$q^\mu = p^\mu + k^\mu\,, \qquad \mathbf{k} = -\mathbf{p}\,, \quad \text{and}$$
$$m_\pi c^2 = \sqrt{(m_\mu c^2)^2 + \mathbf{p}^2c^2} + |\mathbf{p}|\, c\,.$$

This allows us to compute the absolute value of the momentum $\mathbf{p}$ or $\mathbf{k}$, viz.

$$|\mathbf{p}| = |\mathbf{k}| = \frac{m_\pi^2 - m_\mu^2}{2m_\pi}\, c = 58.30\, m_e c\,.$$

Therefore, the kinetic energy of the neutrino is $T^{(\nu)} = 58.30\, m_e c^2$, while
that of the muon is

$$T^{(\mu)} = E_p - m_\mu c^2 = 8.06\, m_e c^2\,.$$

Thus, $T^{(\mu)} + T^{(\nu)} = 66.36\, m_e c^2 = 0.243\, m_\pi c^2$, as asserted above. The lion’s
share of this kinetic energy is carried away by the neutrino, in spite of the fact
that muon and neutrino have equal and opposite momenta. On the other hand, the
muon shares the major part of the total energy, namely $E_p = 214.8\, m_e c^2$, because
it is massive.
(ii) In contrast to nonrelativistic mechanics, the transition to vanishing rest mass
poses no problems. For m = 0 we have $E = |\mathbf{p}|\, c$ and $p^\mu = (|\mathbf{p}|, \mathbf{p})$. A particle
without mass nevertheless carries both energy and momentum. Its velocity always
has magnitude c, cf. (4.82), no matter how small $|\mathbf{p}|$ is. However, it does not have
a rest system. There is no causal way of following the particle and of “catching
it up” because the boosts diverge for $|\mathbf{v}| \to c$.

We already know an example of massless elementary particles: the photons. 
Photons correspond to the elementary excitations of the radiation field. As they are 
massless, one is led to conjecture that the theory of the electromagnetic radiation 
field cannot be based on nonrelativistic mechanics. Rather, this theory (which is 
the subject of electrodynamics) must be formulated within a framework that con- 
tains the speed of light as a natural limit for velocities. Indeed, Maxwell’s theory of 
electromagnetic phenomena is invariant under Lorentz transformations. Neutrinos,
some of which have nonvanishing though very small masses, can often be treated
as being massless.
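
The numbers quoted for the pion decay in item (i) follow directly from (4.84) and from energy-momentum conservation. The sketch below recomputes them; it is an illustration only, works in units with $m_e = c = 1$, treats the neutrino as exactly massless (as the text does for this estimate), and uses a hypothetical helper `two_body_decay`.

```python
import math

M_PI = 273.13    # pion rest mass in units of m_e, value quoted in the text
M_MU = 206.77    # muon rest mass in units of m_e, value quoted in the text


def two_body_decay(m_parent, m_massive):
    """Decay at rest into one massive and one massless particle (m_e = c = 1)."""
    p = (m_parent**2 - m_massive**2) / (2.0 * m_parent)    # |p| = |k|
    e_massive = math.sqrt(m_massive**2 + p**2)             # (4.84)
    e_massless = p                                         # E = |k| c
    return p, e_massive, e_massless


p, e_mu, e_nu = two_body_decay(M_PI, M_MU)
t_mu = e_mu - M_MU            # kinetic energy of the muon, (4.85)
t_nu = e_nu                   # kinetic energy of the massless neutrino
fraction = (t_mu + t_nu) / M_PI

if __name__ == "__main__":
    print(f"|p| = {p:.2f} m_e c")
    print(f"T(mu) = {t_mu:.2f}, T(nu) = {t_nu:.2f} (units of m_e c^2)")
    print(f"E(mu) = {e_mu:.1f}, kinetic fraction = {fraction:.3f}")
```

The total energy adds up to $m_\pi c^2$, the neutrino takes most of the kinetic energy, and the muon carries most of the total energy, in agreement with the values given above.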

We now summarize the findings of this section. The state of a free particle of
rest mass m is characterized by the energy-momentum four-vector $p^\mu = (E/c, \mathbf{p})$,
whose norm is invariant and for which

$$p^2 = \frac{E^2}{c^2} - \mathbf{p}^2 = m^2c^2\,.$$

We note that this four-vector is always timelike, or lightlike if m = 0. If, as shown
in Fig. 4.7, we plot the time component $p^0$ as the ordinate, the space components
(symbolically) as the abscissa, $p^\mu$ is found to lie on a hyperboloid. As the energy
E must be positive, only the upper part of this hyperboloid is relevant. The surface
obtained in this way is said to be the mass shell of the particle with mass m. It
describes either all physically possible states of the free particle, or, alternatively,
a fixed state with energy-momentum $p^\mu$ as observed from all possible inertial
systems of reference.

Fig. 4.7. Schematic representation of energy and momentum of a particle. The points
$(p^0 = E/c, \mathbf{p})$ lie on the upper half of the hyperboloid



4.9.3 The Lorentz Force 



A charged particle traversing external electric and magnetic fields at velocity v 
experiences the Lorentz force (2.29) or (1.49) that we discussed in the context 
of nonrelativistic dynamics. Here we wish to derive the corresponding relativistic 
equation of motion (4.78) in a covariant formulation. 

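With respect to an inertial system of reference, where $\dot{x} = (\gamma c, \gamma\mathbf{v})^T$ and
$d/d\tau = \gamma\, d/dt$, the spatial part of (4.78) reads

$$\gamma\, \frac{d}{dt}\, \mathbf{p} = m\gamma\, \frac{d}{dt}(\gamma\mathbf{v}) = \gamma e \left( \mathbf{E} + \frac{1}{c}\, \mathbf{v} \times \mathbf{B} \right)\,. \tag{4.86}$$

First we show that its time component follows from (4.86) and is given by

$$m\gamma\, \frac{d}{dt}(\gamma c) = \gamma\, \frac{e}{c}\, \mathbf{E} \cdot \mathbf{v}\,. \tag{4.87}$$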



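This is seen as follows. Calculating the scalar product of (4.86) and $\mathbf{v}/c$, its
right-hand side becomes $\gamma (e/c)\, \mathbf{E} \cdot \mathbf{v}$. Thus, one obtains

$$\gamma\, \frac{e}{c}\, \mathbf{E} \cdot \mathbf{v} = m\gamma\, \frac{\mathbf{v}}{c} \cdot \frac{d}{dt}(\gamma\mathbf{v}) = mc\, \gamma\boldsymbol{\beta} \cdot \frac{d}{dt}(\gamma\boldsymbol{\beta}) = mc\, \frac{1}{2}\, \frac{d}{dt}(\gamma\boldsymbol{\beta})^2 = mc\, \frac{1}{2}\, \frac{d}{dt}(\gamma^2\beta^2)\,,$$

where we set $\boldsymbol{\beta} = \mathbf{v}/c$. As

$$\gamma^2\beta^2 = \frac{\beta^2}{1 - \beta^2} = \gamma^2 - 1\,,$$

we find

$$\gamma\, \frac{e}{c}\, \mathbf{E} \cdot \mathbf{v} = mc\, \frac{1}{2}\, \frac{d}{dt}\, \gamma^2 = mc\gamma\, \frac{d}{dt}\, \gamma\,,$$

which proves (4.87). Next we show that (4.86) and (4.87) can be combined to a
covariant equation of motion, with $u = \dot{x}$:

$$m\, \frac{d}{d\tau}\, u^\mu = \frac{e}{c}\, F^{\mu\nu} u_\nu\,. \tag{4.88}$$

This means that the relativistic form of the Lorentz force is

$$K^\mu = \frac{e}{c}\, F^{\mu\nu} u_\nu\,. \tag{4.89}$$

Here, $F^{\mu\nu}$ is a tensor with respect to Lorentz transformations. It is antisymmetric,
$F^{\nu\mu} = -F^{\mu\nu}$, because, with $u_\mu u^\mu = \text{const.}$, (4.88) implies that $u_\mu F^{\mu\nu} u_\nu = 0$.
In an arbitrary inertial system it is given by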



$$F^{\mu\nu} = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & -B^3 & B^2 \\ E^2 & B^3 & 0 & -B^1 \\ E^3 & -B^2 & B^1 & 0 \end{pmatrix}\,. \tag{4.90}$$



The requirement that $F^{\mu\nu}$ yield the Lorentz force fixes this tensor uniquely. To
prove this, we note that $u_\nu$ (with a lower index) is $u_\nu = g_{\nu\sigma} u^\sigma = (\gamma c, -\gamma\mathbf{v})$ and
work out the multiplication on the right-hand side of (4.89). This indeed gives
(4.86) and (4.87).
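
The multiplication on the right-hand side of (4.89) can also be carried out numerically. The sketch below is an illustration under assumptions: the field values, the charge e = 2 and the choice c = 2 are arbitrary sample numbers, and the function names are hypothetical. It builds $F^{\mu\nu}$ from (4.90), contracts it with $u_\nu = (\gamma c, -\gamma\mathbf{v})$ and compares the result with the right-hand sides of (4.86) and (4.87).

```python
import math


def field_tensor(E, B):
    """Contravariant field-strength tensor F^{mu nu} of (4.90)."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [
        [0.0, -E1, -E2, -E3],
        [E1, 0.0, -B3, B2],
        [E2, B3, 0.0, -B1],
        [E3, -B2, B1, 0.0],
    ]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lower_velocity(v, c):
    """u_nu = (gamma c, -gamma v), together with gamma."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c**2)
    return [gamma * c] + [-gamma * x for x in v], gamma


def lorentz_force(E, B, v, e, c):
    """K^mu = (e/c) F^{mu nu} u_nu, cf. (4.89)."""
    F = field_tensor(E, B)
    u_lower, _ = lower_velocity(v, c)
    return [e / c * sum(F[mu][nu] * u_lower[nu] for nu in range(4))
            for mu in range(4)]


E_field = (0.2, -1.0, 0.5)      # sample values, arbitrary units
B_field = (1.5, 0.3, -0.7)
v = (0.1, 0.4, -0.2)
e, c = 2.0, 2.0

K = lorentz_force(E_field, B_field, v, e, c)
u_lower, gamma = lower_velocity(v, c)
v_cross_B = cross(v, B_field)
expected_space = [gamma * e * (E_field[i] + v_cross_B[i] / c) for i in range(3)]
expected_time = gamma * e / c * sum(E_field[i] * v[i] for i in range(3))

if __name__ == "__main__":
    print(K)
    print([expected_time] + expected_space)
```

The contraction $u_\mu K^\mu$ vanishes as well, which reflects the antisymmetry of $F^{\mu\nu}$.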

The relativistic Lorentz force has a form that differs from the Newtonian force
of Sect. 4.9.1. It is not generated by “boosting” a Newtonian, velocity-independent
force but is the result of applying the tensor (4.90) to the velocity $u^\mu$. This tensor,
which is antisymmetric, is said to be the tensor of field strengths. Its time-space
and space-time components are the components of the electric field,

$$F^{i0} = -F^{0i} = E^i\,, \tag{4.91a}$$

while its space-space components contain the magnetic field according to

$$F^{21} = -F^{12} = B^3 \quad \text{(and cyclic permutations)}\,. \tag{4.91b}$$

The covariant form (4.88) of the equation of motion for a charged particle in elec- 
tric and magnetic fields shows that these fields cannot be the space components 
of four-vectors. Instead, they are components of a tensor over Minkowski space 
$M^4$, as indicated in (4.90) or (4.91). This means, in particular, that electric and
magnetic fields are transformed into each other by special Lorentz transformations. 
For example, a charged particle that is at rest with respect to an observer gener- 
ates a static (i.e. time-independent), spherically symmetric electric field. If, on the 
other hand, the particle and the observer move at a constant velocity v relative to 
each other, the observer will measure both electric and magnetic fields. (See e.g. 
Jackson 1998, Sect. 11.10.) 
Suppose we are given a clock that ticks at regular and fixed time intervals $\Delta t$
and that we wish to read from different inertial systems. This idea is meaningful
because precise measurements of time are done by measuring atomic or molecu- 
lar frequencies and comparing them with reference frequencies. Such frequencies 
are internal properties of the atomic or molecular system one is using and do not 
depend on the state of motion of the system. 

For an observer who sees the clock at rest with respect to his inertial system,
two consecutive ticks are separated by the space-time interval $\{d\mathbf{x} = 0, dt = \Delta t\}$.
Using this data, he calculates the invariant interval of proper time with the result

$$d\tau = \sqrt{(dt)^2 - (d\mathbf{x})^2/c^2} = \Delta t\,.$$

Another observer who moves with constant velocity relative to the first observer,
and therefore also relative to the clock, sees that consecutive ticks are separated
by the space-time interval $\{\Delta t', \Delta\mathbf{x}' = \mathbf{v}\Delta t'\}$. From his data he calculates the
invariant interval of proper time to be

$$d\tau' = \sqrt{(\Delta t')^2 - (\Delta\mathbf{x}')^2/c^2} = \sqrt{1 - \beta^2}\; \Delta t'\,.$$

As proper time is Lorentz invariant, we have $d\tau' = d\tau$. This means that the second
observer (for whom the clock is in motion) sees the clock tick with a longer
period, given by

$$\Delta t' = \frac{\Delta t}{\sqrt{1 - \beta^2}} = \gamma\, \Delta t\,. \tag{4.92}$$



This is the important phenomenon of time dilatation: for an observer who sees
the clock in motion, it is slower than at rest, i.e. it ticks at time intervals that are
dilated by a factor $\gamma$. In Example (iii) of Sect. 4.1 we discussed a situation where
time dilatation was actually observed. The experiment quoted there confirms the
effect predicted in (4.92) with the following accuracy: the difference $\Delta t - \Delta t'/\gamma$
is zero within the experimental error bar,

$$\frac{\tau^{(0)}(\mu) - \tau^{(v)}(\mu)/\gamma}{\tau^{(0)}(\mu)} = (0.2 \pm 0.9) \times 10^{-3}\,.$$



Another, closely related effect of special Lorentz transformations is the scale 
or Fitzgerald-Lorentz contraction that we now discuss. It is somewhat more dif- 
ficult to describe than time dilatation because the determination of the length of a 
scale requires, strictly speaking, the measurement of two space points at the same 
time. As such points are separated by a spacelike distance, this cannot be a causal, 
and hence physical, measurement. A way out of this problem would be to let two 
scales of equal length move towards each other and to compare their positions at the 
moment they overlap. Alternatively, we may use the following simple argument.
Suppose there are two landmarks at the space points

$$\mathbf{x}^{(A)} = (0, 0, 0) \quad \text{and} \quad \mathbf{x}^{(B)} = (L_0, 0, 0)\,,$$

where the coordinates refer to the inertial system $K_0$. Because we want to measure
their spatial distance, we ask an observer to make a journey from A to B,
as shown in Fig. 4.8, with constant velocity $\mathbf{v} = (v, 0, 0)$, and, of course, $v < c$.
As seen from $K_0$ he departs from A at time t = 0 and reaches the landmark B
at time $t = T_0$, B having moved to C during this time in our space-time diagram
(Fig. 4.8). In the case of Galilei transformations, i.e. for nonrelativistic motion, we
would conclude that the distance is

$$L_0 = vT_0\,.$$




Fig. 4.8. An observer traveling at constant velocity determines the distance from A
to B. He finds $L = L_0/\gamma$

In the relativistic, Lorentz-invariant world we find a different result. When the
traveler reaches C, his own, comoving clock shows the time $T = T_0/\gamma$, with
$\gamma = (1 - v^2/c^2)^{-1/2}$. Thus, he concludes that the length separating A and B is

$$L = vT = vT_0/\gamma = L_0/\gamma\,. \tag{4.93}$$

In other words, the scale AB that moves relative to the traveling observer (with
velocity $-\mathbf{v}$) appears to him contracted by the factor $1/\gamma$. This is the phenomenon
of scale contraction, or Fitzgerald-Lorentz contraction.

One easily understands that scales oriented along the 2-axis or the 3-axis, or 
any other direction in the (2,3)-plane, remain unchanged and do not appear con- 
tracted. Therefore, the phenomenon of scale contraction means, more precisely, 
that an extended body that moves relative to an inertial system appears contracted
in the direction of its velocity $\mathbf{v}$ only. The spatial dimensions perpendicular to $\mathbf{v}$
remain unmodified. 
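
For orientation, the two effects can be put into numbers. The sketch below is an illustration only, with c = 1 and sample values v = 0.6 c and $L_0 = 10$; the function names are invented for this example. It evaluates (4.92) and retraces the traveler's reasoning that leads to (4.93).

```python
import math


def gamma_factor(v, c=1.0):
    """gamma = (1 - v^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def dilated_period(dt_rest, v, c=1.0):
    """Tick period seen by an observer relative to whom the clock moves, (4.92)."""
    return gamma_factor(v, c) * dt_rest


def traveler_measurement(L0, v, c=1.0):
    """Journey from A to B: returns (T0, T, L) with L = v T as in (4.93)."""
    T0 = L0 / v                       # travel time as seen from K_0
    T = T0 / gamma_factor(v, c)       # time shown by the comoving clock
    return T0, T, v * T               # distance inferred by the traveler


period = dilated_period(1.0, 0.6)
T0, T, L = traveler_measurement(10.0, 0.6)

if __name__ == "__main__":
    print(period, T0, T, L)
```

With these sample values the moving clock ticks with a period of 1.25 instead of 1, and the traveler infers a distance of 8 instead of 10 length units.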

The book by Ellis and Williams (1994) contains an elementary but well il- 
lustrated discussion of time dilatation and scale contraction as well as the appar- 
ent paradoxes of special relativity. Although it was written for laymen, as Ruth 
Williams told me, it seems to me that this book is not only entertaining but also 
useful for the reader who wishes to get a better feeling for time and space in special 
relativity. 



4.11 More About the Motion of Free Particles 



By definition, the state of motion of a free particle is characterized by its relativistic 
energy-momentum vector (4.83) being on its mass shell, 

$$p^2 = E^2/c^2 - \mathbf{p}^2 = m^2c^2\,. \tag{4.94}$$



We wish to describe this relativistic motion without external forces by means of 
the methods of canonical mechanics. As we are dealing with free motion in a flat 
space, the solutions of Hamilton’s variational principle will be just straight lines 
in the space-time continuum. Therefore, we assume the action integral (2.27) to 
be given by the path integral between two points A and B in space-time, where 
A and B are timelike relative to each other: 

$$I[x] = K \int_A^B ds\,, \quad \text{with} \quad (x^{(B)} - x^{(A)})^2 > 0\,. \tag{4.95}$$

As we showed in Sect. 2.36, the action integral is closely related to the generating
function $S^*$, which satisfies the equation of Hamilton and Jacobi. Assuming the
solutions to be inserted in (4.95), we have

$$S^* = K \int_A^B ds\,. \tag{4.96}$$

Here the quantity K is a constant whose dimension is easy to determine: the action
has the dimension (energy × time) and s has the dimension (length). Therefore,
K must have the dimension (energy/velocity), or, equivalently, (mass × velocity).
On the other hand, I or $S^*$ must be Lorentz invariant. The only invariant parameters
that carry a dimension are the rest mass of the particle and the velocity
of light. Thus, up to a sign, K is the product mc. In fact, as we show below, the
correct choice is $K = -mc$.

With respect to an arbitrary, but fixed, inertial system we have $ds = c\, d\tau =
\sqrt{1 - v^2/c^2}\; c\, dt$, with $\mathbf{v} = d\mathbf{x}/dt$. Thus,

$$I = -mc^2 \int_{t^{(A)}}^{t^{(B)}} \sqrt{1 - v^2/c^2}\; dt\,.$$

This yields the (natural form of) the Lagrangian function whose Euler-Lagrange
equations describe relativistic free motion. Expanding this Lagrangian function in
terms of v/c, we find the expected nonrelativistic form

$$L = -mc^2 \sqrt{1 - v^2/c^2} \simeq -mc^2 + \frac{1}{2}\, mv^2 \tag{4.97}$$



to which the term $-mc^2$ is added. The form (4.97) for the Lagrangian function is
not quite satisfactory because it refers to a fixed inertial system and therefore is not
manifestly invariant. The reason for this is that we introduced a time coordinate.
The time variable, being the time component of a four-vector, is not invariant. If
instead we introduce some other, Lorentz-invariant parameter $\tau$ (we give it the
dimension of time), then (4.95) reads

$$I = -mc \int_A^B d\tau\, \sqrt{\frac{dx^\alpha}{d\tau}\, \frac{dx_\alpha}{d\tau}}\,, \tag{4.98}$$

so that the invariant Lagrangian function reads

$$L_{\text{inv}} = -mc\, \sqrt{\frac{dx^\alpha}{d\tau}\, \frac{dx_\alpha}{d\tau}} \equiv -mc\, \sqrt{\dot{x}^2}\,, \tag{4.99}$$

where $\dot{x}^\alpha = dx^\alpha/d\tau$. One realizes again that $\dot{x}^2$ must be positive, i.e. that $\dot{x}$ must
be timelike. The Euler-Lagrange equations that follow from the action (4.98) are

$$\frac{\partial L_{\text{inv}}}{\partial x^\alpha} - \frac{d}{d\tau}\, \frac{\partial L_{\text{inv}}}{\partial \dot{x}^\alpha} = 0\,,$$

and hence

$$\frac{d}{d\tau}\, \frac{mc\, \dot{x}_\alpha}{\sqrt{\dot{x}^2}} = 0\,.$$

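Here the momentum canonically conjugate to $x^\alpha$ is

$$p_\alpha = \frac{\partial L_{\text{inv}}}{\partial \dot{x}^\alpha} = -mc\, \frac{\dot{x}_\alpha}{\sqrt{\dot{x}^2}}\,. \tag{4.100}$$

It satisfies the constraint

$$p^2 - m^2c^2 = 0\,. \tag{4.101}$$

If we now attempt to construct the Hamiltonian function, following the rules of
Chap. 2, we find that

$$H = \dot{x}^\alpha p_\alpha - L_{\text{inv}} = 0\,.$$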

The essential reason the Hamiltonian function vanishes is that the description of
the motion as given here contains a redundant degree of freedom, namely the time
coordinate of x. The dynamics is contained in the constraint (4.101). One also
realizes that the Legendre transformation from $L_{\text{inv}}$ to H cannot be performed: the
condition for this transformation to exist,

$$\det \left( \frac{\partial^2 L_{\text{inv}}}{\partial \dot{x}^\beta\, \partial \dot{x}^\alpha} \right) \neq 0\,,$$



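is not fulfilled. Indeed, calculating the matrix of second derivatives, one obtains

$$ \frac{\partial^2 L_{\rm inv}}{\partial \dot{x}^\beta\,\partial \dot{x}^\alpha} = -\frac{mc}{(\dot{x}^2)^{3/2}}\left[\dot{x}^2 g_{\alpha\beta} - \dot{x}_\alpha\dot{x}_\beta\right] . $$

The following argument shows that the determinant of this matrix vanishes. Define

$$ A_{\alpha\beta} := \dot{x}^2 g_{\alpha\beta} - \dot{x}_\alpha\dot{x}_\beta . $$

The homogeneous system of linear equations $A_{\alpha\beta}u^\beta = 0$ has a nontrivial solution
precisely if $\det A = 0$. Therefore, if we can find a nonvanishing $u^\beta \neq (0, 0, 0, 0)$
that is a solution of this system, then the determinant of $A$ vanishes. There is indeed
such a solution, namely $u^\beta = c\,\dot{x}^\beta$, because for any $\dot{x}^\beta \neq 0$

$$ A_{\alpha\beta}\dot{x}^\beta = \dot{x}^2\dot{x}_\alpha - \dot{x}^2\dot{x}_\alpha = 0 . $$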
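
The two statements above, that $\dot{x}^\beta$ is a null vector of $A_{\alpha\beta}$ and that the Hamiltonian function vanishes, can be spot-checked numerically. The following is only a minimal sketch: the metric signature $(+,-,-,-)$, the sample velocity, and all function names are assumptions made for the illustration and do not appear in the text.

```python
import math

# Metric signature (+, -, -, -); an assumption consistent with timelike xdot^2 > 0.
METRIC = (1.0, -1.0, -1.0, -1.0)

def minkowski_dot(a, b):
    """Return a^alpha b_alpha for two four-vectors given by their components."""
    return sum(g * ai * bi for g, ai, bi in zip(METRIC, a, b))

def lower_index(v):
    """Return the covariant components v_alpha = g_{alpha beta} v^beta."""
    return [g * vi for g, vi in zip(METRIC, v)]

def hessian_core(xdot):
    """Return A_{alpha beta} = xdot^2 g_{alpha beta} - xdot_alpha xdot_beta."""
    x2 = minkowski_dot(xdot, xdot)
    xl = lower_index(xdot)
    return [[x2 * (METRIC[a] if a == b else 0.0) - xl[a] * xl[b]
             for b in range(4)] for a in range(4)]

def apply_to_velocity(xdot):
    """Return the four numbers A_{alpha beta} xdot^beta."""
    A = hessian_core(xdot)
    return [sum(A[a][b] * xdot[b] for b in range(4)) for a in range(4)]

def momentum(xdot, m=1.0, c=1.0):
    """Covariant momentum p_alpha = -m c xdot_alpha / sqrt(xdot^2), cf. (4.100)."""
    norm = math.sqrt(minkowski_dot(xdot, xdot))
    return [-m * c * xl / norm for xl in lower_index(xdot)]

def hamiltonian(xdot, m=1.0, c=1.0):
    """H = xdot^alpha p_alpha - L_inv with L_inv = -m c sqrt(xdot^2)."""
    lagrangian = -m * c * math.sqrt(minkowski_dot(xdot, xdot))
    p = momentum(xdot, m, c)
    return sum(xi * pi for xi, pi in zip(xdot, p)) - lagrangian

xdot = [2.0, 0.3, -0.5, 1.1]        # arbitrary timelike sample velocity
residual = apply_to_velocity(xdot)  # every entry vanishes up to rounding
h_value = hamiltonian(xdot)         # vanishes up to rounding
print(residual, h_value)
```

Because the check uses one sample point only, it illustrates the algebraic argument but does not replace it.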

For the first time we meet here a Lagrangian system that is not equivalent, in a
canonical way, to a Hamiltonian system. In fact, this is an example of a Lagrangian
(or Hamiltonian) system with constraints, whose analysis must be discussed
separately.

In the example discussed above, one could proceed as follows. At first one
ignores the constraint (4.101) but introduces it into the Hamiltonian function by
means of a so-called Lagrange multiplier. With $H$ as given above, we take

$$ H' = H + \lambda\,\Psi(p) , \quad\text{with}\quad \Psi(p) := p^2 - m^2c^2 , $$

$\lambda$ denoting the multiplier. The coordinates and momenta satisfy the canonical
Poisson brackets

$$ \{x_\alpha, x_\beta\} = 0 = \{p_\alpha, p_\beta\} , \qquad \{p_\alpha, x_\beta\} = g_{\alpha\beta} . $$

The canonical equations read

$$ \dot{x}^\alpha = \{H', x^\alpha\} = \{\lambda, x^\alpha\}\Psi(p) + \lambda\{\Psi(p), x^\alpha\}
   = \lambda\{p^2 - m^2c^2, x^\alpha\} = \lambda\{p^2, x^\alpha\} = 2\lambda p^\alpha , $$
$$ \dot{p}^\alpha = \{H', p^\alpha\} = \{\lambda, p^\alpha\}\Psi(p) = 0 , $$

where we made use of the constraint $\Psi(p) = 0$. With $p_\alpha$ and $\dot{x}_\alpha$ being related by
(4.100), we deduce $\lambda = -\sqrt{\dot{x}^2}/(2mc)$. The equation of motion is the same as
above, $\dot{p}^\alpha = 0$.

4.12 The Conformal Group



In Sect. 4.3 we argued that the laws of nature that apply to massive particles
always involve quantities with dimensions and therefore cannot be scale invariant.
As a consequence, the transformation law (4.23) must hold with condition (4.24)
and the choice $\alpha = 1$. In a world in which there is only radiation, this
restriction does not apply because radiation fields are mediated by massless particles
(quanta). Therefore, it is interesting to ask about the most general transformations
that guarantee the invariance of the form

$$ z^2 = 0 \quad\text{with}\quad z = x_A - x_B \quad\text{and}\quad x_A, x_B \in M^4 . $$

The Poincaré transformations that we had constructed for the case $\alpha = 1$ certainly
belong to this class. As we learnt in Sect. 4.4, the Poincaré transformations form
a group that has 10 parameters. If only the invariance of $z^2 = 0$ is required, then
there are two more classes of transformations. These are the dilatations

$$ x'^\mu = \lambda x^\mu \quad\text{with}\quad \lambda \in \mathbb{R} , $$



which depend on one parameter and which form a subgroup by themselves.
Obviously, they are linear.

One can show that there is still another class of (nonlinear) transformations
that leave the light cone invariant. They read (see Exercise 4.15)

$$ x'^\mu = \frac{x^\mu + x^2 c^\mu}{1 + 2(c\cdot x) + c^2x^2} . \tag{4.102} $$



They depend on four real parameters and are said to be special conformal
transformations. They form a subgroup, too: the unit is given by $c^\mu = 0$; the composition
of two transformations of the type (4.102) is again of the same type, because

$$ x'^\mu = \frac{x^\mu + x^2c^\mu}{\sigma(c, x)} \quad\text{with}\quad \sigma(c, x) = 1 + 2(c\cdot x) + c^2x^2 ,
   \qquad x'^2 = \frac{x^2}{\sigma(c, x)} , $$

and

$$ x''^\mu = \frac{x'^\mu + x'^2 d^\mu}{\sigma(d, x')} = \frac{x^\mu + x^2(c^\mu + d^\mu)}{\sigma(c + d, x)} . $$
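
The composition law just stated can be checked numerically for sample values. The sketch below is an illustration only: it assumes the metric signature $(+,-,-,-)$, and the sample points, the parameters `c` and `d`, and the function names are hypothetical choices. It also verifies that a lightlike separation $z = x_A - x_B$ stays lightlike under (4.102).

```python
# Metric signature (+, -, -, -); an assumption made for this sketch.
METRIC = (1.0, -1.0, -1.0, -1.0)

def dot(a, b):
    """Minkowski product of two four-vectors."""
    return sum(g * ai * bi for g, ai, bi in zip(METRIC, a, b))

def sigma(c, x):
    """sigma(c, x) = 1 + 2 (c.x) + c^2 x^2, the denominator in (4.102)."""
    return 1.0 + 2.0 * dot(c, x) + dot(c, c) * dot(x, x)

def special_conformal(c, x):
    """Apply the special conformal transformation with parameter c to x."""
    s = sigma(c, x)
    x2 = dot(x, x)
    return [(xi + x2 * ci) / s for xi, ci in zip(x, c)]

x = [0.7, 0.2, -0.4, 0.1]
c = [0.05, -0.02, 0.03, 0.01]
d = [-0.01, 0.04, 0.02, -0.03]

# Composition: first c, then d, compared with the single parameter c + d.
composed = special_conformal(d, special_conformal(c, x))
direct = special_conformal([ci + di for ci, di in zip(c, d)], x)

# Inverse: the parameter -c undoes the transformation with parameter c.
round_trip = special_conformal([-ci for ci in c], special_conformal(c, x))

# Light cone: x_b differs from x by a lightlike vector.
x_b = [xi + zi for xi, zi in zip(x, [0.3, 0.3, 0.0, 0.0])]
z_new = [a - b for a, b in zip(special_conformal(c, x), special_conformal(c, x_b))]
print(composed, direct, dot(z_new, z_new))
```

The sample parameters are kept small so that $\sigma$ stays away from zero; at points where $\sigma(c, x) = 0$ the transformation is singular and the sketch would fail with a division by zero.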



Finally, the inverse of (4.102) is given by the choice $d^\mu = -c^\mu$. Thus, one
discovers the conformal group over Minkowski space $M^4$. This group has

$$ 10 + 1 + 4 = 15 $$

parameters. It plays an important role in field theories that do not contain any
massive particle.

In many respects, mechanics carries geometrical structures. This could be felt very
clearly at various places in the first four chapters. The most important examples 
are the structures of the space-time continua that support the dynamics of non- 
relativistic and relativistic mechanics, respectively. The formulation of Lagrangian 
mechanics over the space of generalized coordinates and their time derivatives, 
as well as of Hamilton-Jacobi canonical mechanics over the phase space, reveals 
strong geometrical features of these manifolds. (Recall, for instance, the symplec- 
tic structure of phase space and Liouville’s theorem.) To what extent mechanics 
is of geometric nature is illustrated by the fact that, historically, it gave important 
impulses to the development of differential geometry. In turn, the modern formula- 
tion of differential geometry and of some related mathematical disciplines provided 
the necessary tools for the treatment of problems in qualitative mechanics that are 
the topic of present-day research. This provides another impressive example of 
cross-fertilization of pure mathematics and theoretical physics. 

In this chapter we show that canonical mechanics quite naturally leads to a 
description in terms of differential geometric notions. We develop some of the ele- 
ments of differential geometry and formulate mechanics by means of this language. 
For lack of space, however, this chapter cannot cover all aspects of the mathemat- 
ical foundations of mechanics. Instead, it offers an introduction with the primary 
aim of motivating the necessity of the geometric language and of developing the 
elements up to a point from where the transition to the mathematical literature on 
mechanics (see the list of references) should be relatively smooth. This may help 
to reduce the disparity between texts written in a more physics-oriented language 
and the modern mathematical literature and thus to encourage the beginner who 
has to bridge the gap between the two. At the same time this provides a starting 
point for catching up with recent research developments in modern mechanics. 

As a final remark, we note that studying the geometric structure of mechanics, 
in recent years, has become important far beyond this discipline. Indeed, we know 
today that all fundamental interactions of nature carry strong geometric features. 
Once again, mechanics is the door to, and basis of, all of theoretical physics. In 
studying these geometric aspects of the fundamental interactions, we will, at times, 
turn back to mechanics where many of the essential building blocks are developed
in a concrete and well understood framework.

5.1 Manifolds of Generalized Coordinates



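In Sect. 2.11 we showed that every diffeomorphic mapping of coordinates $\{q\}$ onto
new coordinates $\{q'\}$

$$ G: \{q\} \mapsto \{q'\}: \quad q_i = g_i(q', t) , \quad
   \dot{q}_i = \sum_{k=1}^{f}\frac{\partial g_i}{\partial q'_k}\,\dot{q}'_k + \frac{\partial g_i}{\partial t} \tag{5.1} $$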



leaves the equations of motion form-invariant. This means that, except for purely
practical aspects, any choice of a set of generalized coordinates $\{q\}$ is as good as any
other that is related to the first in a one-to-one and differentiable manner. The
physical system one wishes to describe is independent of the specific choice one makes,
or, more loosely speaking, "the physics is the same", no matter which coordinates
one employs. It is obvious that the transformation must be uniquely invertible, or
one-to-one, as one should not lose information in either direction. The number
of independent degrees of freedom must be the same. Similarly, it is meaningful
to require the mapping to be differentiable because we do not want to destroy or
to change the differential structure of the equations of motion.

Any such choice of coordinates provides a possible, specific realization of the 
mechanical system. Of course, from a practical point of view, there are appropri- 
ate and inappropriate choices, in the sense that the coordinates may be optimally 
adapted to the problem because they contain as many cyclic coordinates as possi- 
ble, or, on the contrary, may be such that they inhibit the solution of the equations 
of motion. This comment concerns the actual solution of the equations of motion 
but not the structure of the coordinate manifold into which the mechanical system 
is embedded. 

In mechanics a set of $f$ generalized coordinates arises by constraining an
initial set of degrees of freedom by a number of independent, holonomic constraints.
For instance, the coordinates of a system of $N$ particles that are initially elements
of an $\mathbb{R}^{3N}$ are constrained by $\Lambda = 3N - f$ equations, in such a way that the $f$
independent, generalized coordinates, in general, are not elements of an $\mathbb{R}^f$. Let
us recall two examples for the sake of illustration.

(i) The plane mathematical pendulum that we studied in Sects. 1.17.2 and 2.30,
Example (ii) has one degree of freedom. The natural choice for a generalized
coordinate is the angle measuring the deviation from the vertical, $q \equiv \varphi$. As this
coordinate takes values in the interval $[-\pi, +\pi]$, with $q = \pi$ and $q = -\pi$ to be
identified, it is an element of the unit circle $S^1$. The coordinate manifold is the $S^1$,
independent of how we choose $q$. (For instance, if we choose the arc $q = s = l\varphi$, $s$
is defined on the circle with radius $l$. This circle is topologically equivalent to $S^1$.)

(ii) The coordinate manifold of the rigid body (Chap. 3) provides another
example. Three of the six generalized coordinates describe the unconstrained motion
of the center of mass and are therefore elements of a space $\mathbb{R}^3$. The remaining
three describe the spatial orientation of the top with respect to a system of
reference whose axes have fixed directions in space. They are angles and belong
to the manifold of the rotation group SO(3). As we learnt earlier, this manifold
can be parametrized in different ways: for instance, by the direction about which
the rotation takes place and by the angle of rotation $(\hat{n}, \varphi)$, or, alternatively, by
three Eulerian angles $(\theta_1, \theta_2, \theta_3)$, using one or the other of the definitions given in
Sects. 3.9 and 3.10. We shall analyze the structure of this manifold in more detail
below, in Sect. 5.2.3. Already at this point it seems plausible that it will turn out
to be rather different from a three-dimensional Euclidean space and that we shall
need further tools of geometry for its description.




Fig. 5.1. Velocity field in the space of coordinates 
and their time derivatives for the one-dimensional 
harmonic oscillator 



Actual solutions of the equations of motion $q(t, t_0, q_0) = \Phi_{t,t_0}(q_0)$ (cf.
Sect. 1.20) are curves in the manifold $Q$ of coordinates. In this sense $Q$ is the
physical space that carries the real motion. However, in order to set up the
equations of motion and to construct their solutions, we also need the time derivatives
$dq/dt = \dot{q}$ of the coordinates as well as Lagrangian functions $L(q, \dot{q}, t)$ over
the space $M$ of the $q$ and the $\dot{q}$. The Lagrangian function is to be inserted into
the action integral $I[q]$, a functional of $q(t)$, from which differential equations of
second order in time follow via Hamilton's variational principle (or some other
extremum principle). For example, for $f = 1$ the physical solutions can be
constructed piecewise if one knows the velocity field. Figure 5.1 shows the example
of the harmonic oscillator and its velocity field (cf. also Sect. 1.17.1). More
generally, this means that we shall have to study vector fields over $M$, and hence the
tangent spaces $T_xQ$ of the manifold $Q$, for all elements $x$ of $Q$.
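
The remark that solutions can be constructed piecewise from the velocity field may be illustrated for the harmonic oscillator. The sketch below assumes the equation of motion $\ddot{q} = -\omega^2 q$ and uses a classical fourth-order Runge-Kutta step as one possible way of following the field in small pieces; the step scheme, the function names, and the parameter values are choices made for this illustration, not part of the text.

```python
import math

def velocity_field(q, v, omega=1.0):
    """Vector field at the point (q, qdot): returns (dq/dt, dqdot/dt)."""
    return v, -omega**2 * q

def rk4_step(q, v, dt, omega=1.0):
    """Advance (q, qdot) by one small step dt along the velocity field."""
    k1q, k1v = velocity_field(q, v, omega)
    k2q, k2v = velocity_field(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v, omega)
    k3q, k3v = velocity_field(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v, omega)
    k4q, k4v = velocity_field(q + dt * k3q, v + dt * k3v, omega)
    q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return q_new, v_new

def integrate(q0, v0, t_end, steps, omega=1.0):
    """Build the solution curve piecewise from many short steps."""
    q, v = q0, v0
    dt = t_end / steps
    for _ in range(steps):
        q, v = rk4_step(q, v, dt, omega)
    return q, v

# After one full period 2*pi/omega the curve closes on itself (up to step error).
q_end, v_end = integrate(1.0, 0.0, 2 * math.pi, 1000)
# After a quarter period the oscillator passes through q = 0 with qdot = -1.
q_quarter, v_quarter = integrate(1.0, 0.0, math.pi / 2, 1000)
print(q_end, v_end, q_quarter, v_quarter)
```

The closed curve obtained in this way is one of the ellipses (circles for $\omega = 1$) along which the arrows of the velocity field in Fig. 5.1 point.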

A similar remark applies to the case where, instead of the variables $(q, \dot{q})$, we
wish to make use of the phase-space variables $(q, p)$. We recall that $p$ was defined
to be the partial derivative of the Lagrangian function by $\dot{q}$,

$$ p_i := \frac{\partial L}{\partial \dot{q}^i} . \tag{5.2} $$



$L$ being a (scalar) function on the space of the $q$ and the $\dot{q}$ (it maps this space
onto the real numbers), i.e. on the union of tangent spaces $T_xQ$, definition (5.2)
leads to the corresponding dual spaces $T_x^*Q$, the so-called cotangent spaces.

These remarks suggest detaching the mechanical system one is considering
from a specific choice of generalized coordinates $\{q\}$ and choosing a more
abstract formulation by defining and describing the manifold $Q$ of physical motions
in a coordinate-free language. The choice of sets of coordinates $\{q\}$ or $\{q'\}$ is
equivalent to describing $Q$ in terms of local coordinates, or, as one also says, in
terms of charts. Furthermore, one is led to study various geometric objects living
on the manifold $Q$, as well as on its tangent spaces $T_xQ$ and cotangent spaces
$T_x^*Q$. Examples are Lagrangian functions that are defined on the tangent spaces
and Hamiltonian functions that are defined on the cotangent spaces, both of which
give real numbers.




Fig. 5.2. Physical motion takes place in the 
coordinate manifold Q. The Lagrangian 
function and the Hamiltonian function 
are defined on the tangent and cotangent 
spaces, respectively 



Figure 5.2 shows a first sketch of these interrelationships. As we shall learn
below, there are many more geometric objects on manifolds other than functions
(which are mappings to the reals). An example that we met earlier is vector fields
such as the velocity field of a flow in phase space. To arouse the reader's
curiosity, we merely recall the Poisson brackets, defined on $T^*Q$, and the
volume form that appears in Liouville's theorem.

An example of a smooth manifold, well known from linear algebra and from
analysis, is provided by the $n$-dimensional Euclidean space $\mathbb{R}^n$. However,
Euclidean spaces are not sufficient to describe general and nontrivial mechanical
systems, as is demonstrated by the examples of the coordinate manifolds of the plane
pendulum and of the rigid body. As we shall see, the union of all tangent spaces

$$ TQ = \{T_xQ \mid x \in Q\} \tag{5.3} $$

and the union of all cotangent spaces

$$ T^*Q = \{T_x^*Q \mid x \in Q\} \tag{5.4} $$

are smooth manifolds. The former is said to be the tangent bundle, the latter the
cotangent bundle. Suppose then that we are given a conservative mechanical
system, or a system with some symmetry. The set of all solutions lies on hypersurfaces
in $2f$-dimensional space that belong to fixed values of the energy or are
characterized by the conserved quantities pertaining to the symmetry. In general, these
hypersurfaces are smooth manifolds, too, but cannot always be embedded in $\mathbb{R}^{2f}$.
Thus, we must learn to describe such physical manifolds by mapping them, at
least locally, onto Euclidean spaces of the same dimension. Or, when expressed
in a more pictorial way, whatever happens on the manifold $M$ is projected onto a
set of charts, each of which represents a local neighborhood of $M$. If one knows
how to join neighboring charts and if one has at one's disposal a complete set of
charts, then one obtains a true image of the whole manifold, however complicated
it may look globally.

The following sections (Sects. 5.2-5.4) serve to define and discuss the notions sketched
above and to illustrate them by means of a number of examples. From Sect. 5.5
on we return to mechanics by formulating it in terms of a geometric language,
preparing the ground for new insights and results. In what follows (Sects. 5.2-5.4)
we shall use the following notation:

$Q$ denotes the manifold of generalized coordinates; its dimension is equal to $f$,
the number of degrees of freedom of the mechanical system one is considering.
$M$ denotes a general smooth (and finite dimensional) manifold of dimension
$\dim M = n$.



5.2 Differentiable Manifolds 

5.2.1 The Euclidean Space $\mathbb{R}^n$

The definition of a differentiable manifold relates directly to our knowledge of
the $n$-dimensional Euclidean space $\mathbb{R}^n$. This space is a topological space. This
means that it can be covered by means of a set of open neighborhoods that fulfills
some quite natural conditions. For any two distinct points of $\mathbb{R}^n$ one can define
neighborhoods of these points that do not overlap: one says that $\mathbb{R}^n$ is a
Hausdorff space. Furthermore, one can always find a collection $B$ of open sets such
that every open subset of $\mathbb{R}^n$ is represented as the union of elements of $B$. Such
a collection $B$ is said to be a basis. It is even possible to construct a countable set
of neighborhoods $\{U_i\}$ of any point $p$ of $\mathbb{R}^n$ such that for any neighborhood $U$
of $p$ there is an $i$ for which $U_i$ is contained in $U$. These $\{U_i\}$ can also be made
a basis, in the sense defined above: thus the space $\mathbb{R}^n$ certainly has a countable
basis. All this is summarized by the statement that $\mathbb{R}^n$ is a topological, Hausdorff
space with a countable basis.

It is precisely these requirements that are incorporated in the definition of a
manifold. Even if they look somewhat complicated at first sight, these properties
are very natural in all important branches of mechanics. Therefore, as a
physicist one has a tendency to take them for granted and to assume tacitly that the
spaces and manifolds of mechanics have these properties. The reader who wishes
to define matters very precisely from the start is consequently advised to consult,
for example, the mathematical literature quoted in the Appendix and to study the
elements of topology and set theory.

The space $\mathbb{R}^n$ has more structure than that. It is an $n$-dimensional real
vector space on which there exists a natural inner product and hence a norm. If
$p = (p^1, p^2, \ldots, p^n)$ and $q = (q^1, q^2, \ldots, q^n)$ are two elements of $\mathbb{R}^n$, the inner
product and the norm are defined by

$$ p\cdot q := \sum_{i=1}^{n} p^i q^i \quad\text{and}\quad |p| := \sqrt{p\cdot p} , \tag{5.5} $$

respectively. Thus $\mathbb{R}^n$ is a metric space. The distance function

$$ d(p, q) := |p - q| \tag{5.6} $$

following from (5.5) has all properties that a metric should have: it is nondegenerate,
i.e. $d(p, q)$ vanishes if and only if $p = q$; it is symmetric, $d(q, p) = d(p, q)$;
and it obeys the triangle inequality (a consequence of Schwarz' inequality)

$$ d(p, r) \le d(p, q) + d(q, r) . $$

Finally, we know that on $\mathbb{R}^n$ one can define smooth functions,

$$ f: U \subset \mathbb{R}^n \to \mathbb{R} , $$

which map open subsets $U$ of $\mathbb{R}^n$ onto the real numbers. The smoothness, or $C^\infty$
property, of a function $f$ means that at every point $u \in U$ all mixed partial
derivatives of $f$ exist and are continuous. As an example consider the function $f^i$ which
associates to every element $p \in \mathbb{R}^n$ its $i$th coordinate $p^i$, as shown in Fig. 5.3,

$$ f^i: \mathbb{R}^n \to \mathbb{R}: \quad p = (p^1, \ldots, p^i, \ldots, p^n) \mapsto p^i , \quad i = 1, 2, \ldots, n . \tag{5.7} $$

These functions $f^i(p) = p^i$ are said to be the natural coordinate functions of $\mathbb{R}^n$.

Fig. 5.3. The coordinate functions $f^i$ and $f^j$ assign to each
point $p$ of $\mathbb{R}^n$ its coordinates $p^i$ and $p^j$, respectively

5.2.2 Smooth or Differentiable Manifolds

Physical manifolds like the ones we sketched in Sect. 5.1 often are not Euclidean
spaces but topological spaces (Hausdorff with countable basis) that carry
differentiable structures. Qualitatively speaking, they resemble Euclidean spaces locally,
i.e. open subsets of them can be mapped onto Euclidean spaces and these "patches"
can be joined like the charts of an atlas.




Fig. 5.4. The chart mapping $\varphi$ maps
an open domain $U$ of the manifold
$M$ homeomorphically onto a domain
$\varphi(U)$ of $\mathbb{R}^n$, where $n = \dim M$



Let $M$ be such a topological space and let its dimension be $\dim M = n$. By
definition, a chart or local coordinate system on $M$ is a homeomorphism,

$$ \varphi: U \subset M \to \varphi(U) \subset \mathbb{R}^n , \tag{5.8} $$

of an open set $U$ of $M$ onto an open set $\varphi(U)$ of $\mathbb{R}^n$, in the way sketched in
Fig. 5.4. Indeed, applying the mapping (5.8) followed by the coordinate functions
(5.7) yields a coordinate representation in $\mathbb{R}^n$

$$ x^i = f^i\circ\varphi \quad\text{or}\quad \varphi(p) = (x^1(p), \ldots, x^n(p)) \in \mathbb{R}^n \tag{5.9} $$

for every point $p \in U \subset M$. This provides the possibility of defining a diversity
of geometrical objects on $U \subset M$ (i.e. locally on the manifold $M$), such as curves,
vector fields, etc. Note, however, that this will not be enough, in general: since these
objects are to represent physical quantities, one wishes to study them, if possible,
on the whole of $M$. Furthermore, relationships between physical quantities must be
independent of the choice of local coordinate systems (one says that the physical
equations are covariant). This leads rather naturally to the following construction.

Cover the manifold $M$ by means of open subsets $U, V, W, \ldots$, such that every
point $p \in M$ is contained in at least one of them. For every subset $U, V, \ldots$ choose
a homeomorphism $\varphi, \psi, \ldots$, respectively, such that $U$ is mapped onto $\varphi(U)$ in $\mathbb{R}^n$,
$V$ onto $\psi(V)$ in $\mathbb{R}^n$, etc. If $U$ and $V$ overlap partially on $M$, then also their images
$\varphi(U)$ and $\psi(V)$ in $\mathbb{R}^n$ will overlap partially, as shown in Fig. 5.5. The composed
mapping $\varphi\circ\psi^{-1}$ and its inverse $\psi\circ\varphi^{-1}$ relate the corresponding portions of the
images $\varphi(U)$ and $\psi(V)$ (the hatched areas in Fig. 5.5) and therefore map an open
subset of one $\mathbb{R}^n$ onto an open subset of another $\mathbb{R}^n$. If these mappings $(\varphi\circ\psi^{-1})$
and $(\psi\circ\varphi^{-1})$ are smooth, the two charts, or coordinate systems, $(\varphi, U)$ and $(\psi, V)$
are said to have smooth overlap. Obviously, this change of chart allows one to join
$U$ and $V$ like two patches of $M$. Assuming this condition of smooth overlap to
be trivially true, in the case where $U$ and $V$ do not overlap at all, provides the
possibility of describing the entire manifold $M$ by means of an atlas of charts.

Fig. 5.5. Two overlapping, open domains $U$ and $V$ on $M$, by the mappings $\varphi$ and $\psi$, respectively,
are mapped onto the open domains $\varphi(U)$ and $\psi(V)$ in two copies of $\mathbb{R}^n$. Their region of overlap on
$M$ is mapped onto the hatched areas. The latter are related diffeomorphically through the transition
mappings $(\varphi\circ\psi^{-1})$ or $(\psi\circ\varphi^{-1})$

An atlas is a collection of charts on the manifold $M$ such that

A1. Every point of $M$ is contained in the domain of at least one chart.

A2. Any pair of two charts overlap smoothly (in the sense defined above).

Before we go on let us ask what we have gained so far. Given such an atlas, 
we can differentiate geometric objects defined on M. This is done in the follow- 
ing way. One projects the objects onto the charts of the atlas and differentiates 
their images, which are now contained in spaces R", using the well-known rules 
of analysis. As all charts of the atlas are related diffeomorphically, this procedure 
extends to the whole of M. In this sense an atlas defines a differentiable structure 
on the manifold M. In other words, with an atlas at hand, it is possible to introduce 
a mathematically consistent calculus on the manifold M. 

There remains a technical difficulty, which, however, can be resolved easily.
With the definition given above, it may happen that two formally different atlases
yield the same calculus on $M$. In order to eliminate this possibility one adds the
following to definitions A1 and A2:

A3. Each chart that has smooth overlap with all other charts shall be contained in
the atlas.

In this case the atlas is said to be complete (or maximal). It is denoted by $A$.
This completes the framework we need for the description of physical relationships
and physical laws on spaces that are not Euclidean $\mathbb{R}^n$ spaces. The objects, defined
on the manifold $M$, can be visualized by mapping them onto charts. In this way,
they can be subject to a consistent calculus as we know it from analysis in $\mathbb{R}^n$.

In summary, the topological structure is given by the definition of the mani- 
fold M, equipped with an atlas; the differential structure on M is fixed by giving 
a complete, differentiable atlas A of charts on M. Thus, a smooth, or differen- 
tiable, manifold is defined by the pair (M, A). We remark, in passing, that there 
are manifolds on which there exist different differentiable structures that are not 
equivalent. 

5.2.3 Examples of Smooth Manifolds 

Let us consider a few examples of differentiable manifolds of relevance for me- 
chanics. 

(i) The space $\mathbb{R}^n$ is a differentiable manifold. The coordinate functions
$(f^1, f^2, \ldots, f^n)$ induce the identical mapping

$$ \mathrm{id}: \mathbb{R}^n \to \mathbb{R}^n $$

of $\mathbb{R}^n$ onto itself. Therefore they yield an atlas on $\mathbb{R}^n$ that contains a single chart.
To make it a complete atlas, we must add the set of all charts on $\mathbb{R}^n$ compatible
with the identity id. These are the diffeomorphisms $\phi: U \to \phi(U) \subset \mathbb{R}^n$ on $\mathbb{R}^n$.
The differentiable structure obtained in this way is said to be canonical.

(ii) A sphere of radius $R$ in $\mathbb{R}^3$. Consider the sphere

$$ S_R^2 := \{x = (x^1, x^2, x^3) \in \mathbb{R}^3 \mid x^2 = (x^1)^2 + (x^2)^2 + (x^3)^2 = R^2\} . $$

We may (but need not) think of it as being embedded in a space $\mathbb{R}^3$. An atlas
that describes this two-dimensional smooth manifold in spaces $\mathbb{R}^2$ must contain at
least two charts. Here we wish to construct an example for them. Call the points

$$ N = (0, 0, R) , \quad S = (0, 0, -R) $$

the north pole and south pole, respectively. On the sphere $S_R^2$ define the open
subsets

$$ U := S_R^2 - \{N\} \quad\text{and}\quad V := S_R^2 - \{S\} . $$

Define the mappings $\varphi: U \to \mathbb{R}^2$, $\psi: V \to \mathbb{R}^2$ onto the charts as follows: $\varphi$
projects the domain $U$ from the north pole onto the plane $x^3 = 0$ through the
equator, while $\psi$ projects the domain $V$ from the south pole onto the same plane
(more precisely, a copy thereof), cf. Fig. 5.6. If $p = (x^1, x^2, x^3)$ is a point of $U$
on the sphere, its projection is given by

$$ \varphi(p) = \frac{R}{R - x^3}\,(x^1, x^2) . $$

Taking the same point to be an element of the domain $V$, we see that its projection
onto $\mathbb{R}^2$ is given by

$$ \psi(p) = \frac{R}{R + x^3}\,(x^1, x^2) . $$

Fig. 5.6. One needs at least two charts for
the description of the surface of a sphere.
Here these charts are obtained by stereographic
projection from the north and
south poles, respectively



Let us then verify that $(\psi\circ\varphi^{-1})$ is a diffeomorphism on the intersection of the
domains $U$ and $V$. We have

$$ \varphi(U \cap V) = \mathbb{R}^2 - \{0\} = \psi(U \cap V) . $$

Let $y = (y^1, y^2)$ be a point on the plane through the equator without origin,
$y \in \mathbb{R}^2 - \{0\}$. Its pre-image on the manifold is

$$ p = \varphi^{-1}(y) = (x^1 = \lambda y^1, \; x^2 = \lambda y^2, \; x^3) , $$

where $\lambda = (R - x^3)/R$, $x^3$ being obtained from the condition $\lambda^2u^2 + (x^3)^2 = R^2$,
and where we have set $u^2 = (y^1)^2 + (y^2)^2$. From this one finds

$$ x^3 = R\,\frac{u^2 - R^2}{u^2 + R^2} $$

and $\lambda = 2R^2/(u^2 + R^2)$, from which one obtains, in turn,

$$ p = \varphi^{-1}(y) = \frac{1}{u^2 + R^2}\left(2R^2y^1, \; 2R^2y^2, \; R(u^2 - R^2)\right) . $$

Applying the mapping $\psi$ to this point and taking account of the relation
$R/(R + x^3) = (u^2 + R^2)/(2u^2)$, we find on $\mathbb{R}^2 - \{0\}$

$$ \psi\circ\varphi^{-1}(y) = \frac{R^2}{u^2}\,(y^1, y^2) . $$

Clearly, this is a diffeomorphism from $\mathbb{R}^2 - \{0\}$ onto $\mathbb{R}^2 - \{0\}$. The origin, which
is the projection of the south pole by the first mapping, and is the projection of the
north pole by the second, must be excluded. Hence the necessity of two charts.
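
The transition map between the two stereographic charts is easy to test numerically. The following sketch implements the two projections and the pre-image formula as they are reconstructed above; the function names and the sample values of `R` and `y` are assumptions made for the illustration.

```python
import math

def phi(p, R):
    """Stereographic projection from the north pole onto the plane x^3 = 0."""
    x1, x2, x3 = p
    return (R * x1 / (R - x3), R * x2 / (R - x3))

def psi(p, R):
    """Stereographic projection from the south pole onto the plane x^3 = 0."""
    x1, x2, x3 = p
    return (R * x1 / (R + x3), R * x2 / (R + x3))

def phi_inverse(y, R):
    """Pre-image on the sphere of the chart point y = (y^1, y^2)."""
    u2 = y[0] ** 2 + y[1] ** 2
    scale = 1.0 / (u2 + R * R)
    return (2 * R * R * y[0] * scale,
            2 * R * R * y[1] * scale,
            R * (u2 - R * R) * scale)

def transition(y, R):
    """The change of chart psi o phi^{-1}, defined for y != 0."""
    return psi(phi_inverse(y, R), R)

R = 2.0
y = (0.3, -1.2)
p = phi_inverse(y, R)
radius = math.sqrt(sum(x * x for x in p))         # p lies on the sphere of radius R
u2 = y[0] ** 2 + y[1] ** 2
expected = (R * R * y[0] / u2, R * R * y[1] / u2)  # (R^2 / u^2) y
image = transition(y, R)
print(radius, image, expected)
```

Calling `transition` with `y = (0, 0)` fails with a division by zero, which mirrors the fact that the origin has to be excluded from the domain of the change of chart.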

(iii) The torus $T^m$. $m$-dimensional tori are the natural manifolds of integrable
mechanical systems (see Sect. 2.37.2). The torus $T^m$ is defined as the product of
$m$ copies of the unit circle,

$$ T^m = S^1 \times S^1 \times \cdots \times S^1 \quad (m \text{ factors}) . $$

For $m = 2$, for instance, it has the shape of the inner tube of a bicycle. The first $S^1$
goes around the tube, the second describes its cross section. The torus $T^2$ is also
homeomorphic to the space obtained from the square $\{x, y \mid 0 \le x \le 1, \; 0 \le y \le 1\}$
by pairwise identification of the points $(0, y)$ and $(1, y)$, and $(x, 0)$ and $(x, 1)$. An
atlas for $T^2$ is provided, for example, by three charts defined as follows:

$$ \varphi_k^{-1}(\alpha_k, \beta_k) = \left(e^{i\alpha_k}, e^{i\beta_k}\right) \in T^2 , \quad k = 1, 2, 3 , $$

where $\alpha_1, \beta_1 \in (0, 2\pi)$, $\alpha_2, \beta_2 \in (-\pi, +\pi)$, $\alpha_3, \beta_3 \in (-\pi/2, 3\pi/2)$.

Readers are invited to make a sketch of the torus and to convince themselves
thereby that $T^2$ is indeed covered completely by the charts given above.
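
As a complement to the sketch, the covering property can also be checked on a grid of points. In the following illustration angles are measured in units of $\pi$ and stored as exact fractions, so that the end points of the open intervals are recognized without rounding problems; this representation, the grid, and the function names are assumptions of the sketch. Each open interval has length $2\pi$ and therefore misses exactly one point of the circle, namely its own end point.

```python
from fractions import Fraction

# Lower end points of the three angular intervals, in units of pi:
# (0, 2 pi), (-pi, pi), (-pi/2, 3 pi/2).
CHART_LOWER_ENDS = [Fraction(0), Fraction(-1), Fraction(-1, 2)]

def angle_in_chart(angle, lower_end):
    """True unless the angle coincides (mod 2 pi) with the excluded end point."""
    return (angle - lower_end) % 2 != 0

def charts_containing(alpha, beta):
    """Return the labels k of all charts whose domain contains the point."""
    return [k + 1 for k, lower_end in enumerate(CHART_LOWER_ENDS)
            if angle_in_chart(alpha, lower_end) and angle_in_chart(beta, lower_end)]

# Grid of angles 0, pi/12, ..., 23 pi/12; it contains all excluded end points.
grid = [Fraction(k, 12) for k in range(24)]
uncovered = [(a, b) for a in grid for b in grid if not charts_containing(a, b)]
print(len(uncovered))
```

A point of $T^2$ has two angular coordinates and can thus hit at most two of the three excluded values, which is why three charts of this kind suffice while two do not.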

(iv) The parameter manifold of the rotation group SO(3), which is the
essential part of the physical coordinate manifold of the rigid body, is a differentiable
manifold. Here we wish to describe it in somewhat more detail. For this purpose
let us first consider the group SU(2) of unitary (complex) $2\times 2$ matrices $U$ with
determinant 1:

$$ \{U \text{ complex } 2\times 2 \text{ matrices} \mid U^\dagger U = \mathbf{1} , \; \det U = 1\} . $$

These matrices form a group, the unitary unimodular group in two complex
dimensions. $U^\dagger$ denotes the complex conjugate of the transposed matrix, $(U^\dagger)_{pq} =
(U_{qp})^*$. It is not difficult to convince oneself that any such matrix can be written
as follows:

$$ U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} \quad\text{provided}\quad |a|^2 + |b|^2 = 1 . $$

With the complex numbers $a$ and $b$ written as $a = x^1 + ix^2$ and $b = x^3 + ix^4$,
the condition $\det U = 1$ becomes

$$ (x^1)^2 + (x^2)^2 + (x^3)^2 + (x^4)^2 = 1 . $$

If the $x^i$ are interpreted as coordinates in a space $\mathbb{R}^4$, this condition describes
the unit sphere $S^3$ embedded in that space. Let us parametrize the coordinates by
means of angles $u$, $v$ and $w$, as follows:

$$ x^1 = \cos u\cos v , \quad x^2 = \cos u\sin v , \quad x^3 = \sin u\cos w , \quad x^4 = \sin u\sin w , $$
$$ u \in [0, \pi/2] , \quad v, w \in [0, 2\pi) , $$

such as to fulfill the condition on their squares automatically. Clearly, the sphere
$S^3$ is a smooth manifold. Every closed curve on it can be contracted to a point,
so it is singly connected. We now wish to work out its relation to SO(3).

For this purpose we return to the representation of rotation matrices $R \in$ SO(3)
by means of Eulerian angles, as defined in (3.35) of Sect. 3.9. Inserting the
expressions (2.71) for the generators and multiplying the three matrices in (3.35),
one obtains

$$ R(\alpha, \beta, \gamma) = \begin{pmatrix}
\cos\gamma\cos\beta\cos\alpha - \sin\gamma\sin\alpha & \cos\gamma\cos\beta\sin\alpha + \sin\gamma\cos\alpha & -\cos\gamma\sin\beta \\
-\sin\gamma\cos\beta\cos\alpha - \cos\gamma\sin\alpha & -\sin\gamma\cos\beta\sin\alpha + \cos\gamma\cos\alpha & \sin\gamma\sin\beta \\
\sin\beta\cos\alpha & \sin\beta\sin\alpha & \cos\beta
\end{pmatrix} . $$

In the next step we define the following map from $S^3$ onto SO(3):

$$ f: S^3 \to \mathrm{SO}(3) $$

by

$$ \gamma = v + w \pmod{2\pi} , \qquad \beta = 2u , \qquad \alpha = v - w \pmod{2\pi} . $$

As $\alpha$ and $\gamma$ take values in $[0, 2\pi)$ and $\beta$ takes values in $[0, \pi]$, the mapping is
surjective. We note the following relations between matrix elements of $R$ and the
angles $u, v, w$:

$$ R_{33} = \cos(2u) , $$
$$ R_{31} = \sqrt{1 - R_{33}^2}\,\cos(v - w) , \qquad R_{13} = -\sqrt{1 - R_{33}^2}\,\cos(v + w) , $$
$$ R_{32} = \sqrt{1 - R_{33}^2}\,\sin(v - w) , \qquad R_{23} = \sqrt{1 - R_{33}^2}\,\sin(v + w) . $$

(The remaining entries, not shown here, are easily derived.)

Consider a point $x \in S^3$, $x(u, v, w)$, and its antipodal point $x' = -x$,
which is obtained by the choice of parameters $u' = u$, $v' = v + \pi \pmod{2\pi}$,
$w' = w + \pi \pmod{2\pi}$. These two points have the same image in SO(3)
because $\gamma' = v' + w' = v + w + 2\pi = \gamma \pmod{2\pi}$; similarly,
$\alpha' = v' - w' = \alpha \pmod{2\pi}$, while $\beta' = \beta$. Thus, the manifold of SO(3) is the
image of $S^3$, but $x$ and $-x$ are mapped onto the same element of SO(3). In other
words, the manifold of the rotation group is $S^3$ with antipodal points identified.
If opposite points on the sphere are to be identified then there are two distinct
classes of closed curves: (i) those which return to the same point and which can
be contracted to a point, and (ii) those which start in $x$ and end in $-x$ and which
cannot be contracted to a point. This is equivalent to saying that the manifold of
SO(3) is doubly connected.
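
The two-to-one character of the map from $S^3$ onto SO(3) can be made concrete with a few lines of code. The sketch below uses the Euler-angle matrix in the form reconstructed above (the original display is damaged, so this form is an assumption to be checked against (3.35)); the function names and the sample angles are likewise chosen only for the illustration.

```python
import math

def euler_matrix(alpha, beta, gamma):
    """Rotation matrix R(alpha, beta, gamma) in the Euler-angle form used above."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * cb * ca - sg * sa, cg * cb * sa + sg * ca, -cg * sb],
        [-sg * cb * ca - cg * sa, -sg * cb * sa + cg * ca, sg * sb],
        [sb * ca, sb * sa, cb],
    ]

def sphere_point(u, v, w):
    """Point (x^1, x^2, x^3, x^4) of the unit sphere S^3 for the angles u, v, w."""
    return (math.cos(u) * math.cos(v), math.cos(u) * math.sin(v),
            math.sin(u) * math.cos(w), math.sin(u) * math.sin(w))

def rotation_from_sphere(u, v, w):
    """The map f: alpha = v - w, beta = 2u, gamma = v + w (periodicity is automatic)."""
    return euler_matrix(v - w, 2 * u, v + w)

u, v, w = 0.4, 1.1, 2.5
x = sphere_point(u, v, w)
x_antipodal = sphere_point(u, v + math.pi, w + math.pi)

rot = rotation_from_sphere(u, v, w)
rot_antipodal = rotation_from_sphere(u, v + math.pi, w + math.pi)

# Largest difference between the two images; it vanishes up to rounding.
max_difference = max(abs(rot[i][j] - rot_antipodal[i][j])
                     for i in range(3) for j in range(3))
print(x, x_antipodal, max_difference)
```

The printed points differ by an overall sign, while the two rotation matrices agree, in accordance with the identification of antipodal points.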

As a side remark we point out that we have touched here on a close relationship
between the groups SU(2) and SO(3) that will turn out to be important in describing
intrinsic angular momentum (spin) in quantum mechanics. The manifold of
the former is the (singly connected) unit sphere $S^3$.



5.3 Geometrical Objects on Manifolds 

Next, let us introduce various geometrical objects that are defined on smooth manifolds
and are of relevance for mechanics. There are many examples: functions
such as the Lagrangian and Hamiltonian functions, curves on manifolds such as
solution curves of equations of motion, vector fields such as the velocity field of a
given dynamical system, forms such as the volume form that appears in Liouville’s
theorem, and many more.

We start with a rather general notion: mappings from a smooth manifold $M$
with atlas $\mathcal{A}$ onto another manifold $N$ with atlas $\mathcal{B}$ (where $N$ may be identical
with $M$):

$$ F : (M, \mathcal{A}) \to (N, \mathcal{B}) . \tag{5.10} $$

The point $p$, which is contained in an open subset $U$ of $M$, is mapped onto the
point $F(p)$ in $N$, which, of course, is contained in the image $F(U)$ of $U$.

Let $m$ and $n$ be the dimensions of $M$ and $N$, respectively. Assume that $(\varphi, U)$
is a chart from the atlas $\mathcal{A}$, and $(\psi, V)$ a chart from $\mathcal{B}$ such that $F(U)$ is contained
in $V$. The following composition is then a mapping between the Euclidean spaces
$\mathbb{R}^m$ and $\mathbb{R}^n$:

$$ \psi \circ F \circ \varphi^{-1} : \varphi(U) \subset \mathbb{R}^m \to \psi(V) \subset \mathbb{R}^n . \tag{5.11} $$

At this level it is meaningful to ask whether this mapping is continuous
or even differentiable. This suggests the following definition: the mapping $F$
(5.10) is said to be smooth, or differentiable, if the mapping (5.11) has this property
for every point $p \in U \subset M$, every chart $(\varphi, U) \in \mathcal{A}$, and every chart $(\psi, V) \in \mathcal{B}$,
the image $F(U)$ being contained in $V$.

As we shall soon see, we have already met mappings of the kind (5.10) on
several occasions in earlier chapters, although we did not formulate them in this
compact and general manner. This may be clearer if we notice the following special
cases of (5.10). (i) The manifold from which $F$ starts is the one-dimensional
Euclidean space $(\mathbb{R}, d)$, e.g. the time axis $\mathbb{R}_t$. The chart mapping $\varphi$ is then simply
the identity on $\mathbb{R}$. In this case the mapping $F$ (5.10) is a smooth curve on the
manifold $(N, \mathcal{B})$, e.g. a physical orbit. (ii) The manifold to which $F$ leads is $\mathbb{R}$, i.e.
now the chart mapping $\psi$ is the identity. In this case $F$ is a smooth function on $M$,
an example being provided by the Lagrangian function. (iii) Initial and final manifolds
are identical. This is the case, for example, for $F$ being a diffeomorphism
of $M$.

5.3.1 Functions and Curves on Manifolds



A smooth function on a manifold $M$ is a mapping from $M$ to the real numbers,

$$ f : M \to \mathbb{R} : p \in M \mapsto f(p) \in \mathbb{R} , \tag{5.12} $$

which is differentiable, in the sense defined above.

An example is provided by the Hamiltonian function $H$, which assigns a real
number to each point of phase space $\mathbb{P}$, assuming $H$ to be independent of time.
If $H$ has an explicit time dependence, it assigns a real number to each point of
$\mathbb{P} \times \mathbb{R}_t$, the direct product of phase space and time axis. As another example
consider the charts introduced in Sect. 5.2.2. The mapping $x^i = f^i \circ \varphi$ of (5.9), with
the function $f^i$ as defined in (5.7), is a function on $M$. To each point $p \in U \subset M$
it assigns its $i$th coordinate in the chart $(\varphi, U)$.

The set of all smooth functions on $M$ is denoted by $\mathfrak{F}(M)$.

In Euclidean space $\mathbb{R}^n$ the notion of a smooth curve $\gamma(\tau)$ is a familiar one.
When understood as a mapping, it leads from an open interval $I$ of the real axis
$\mathbb{R}$ (this can be the time axis $\mathbb{R}_t$, for instance) to $\mathbb{R}^n$,

$$ \gamma : I \subset \mathbb{R} \to \mathbb{R}^n : \tau \in I \mapsto \gamma(\tau) \in \mathbb{R}^n . \tag{5.13a} $$

Here, the interval may start at $-\infty$ and/or may end at $+\infty$. If $\{e_i\}$ is a basis of
$\mathbb{R}^n$, then $\gamma(\tau)$ has the decomposition

$$ \gamma(\tau) = \sum_{i=1}^{n} \gamma^i(\tau)\, e_i . \tag{5.13b} $$

On an arbitrary smooth manifold $N$ smooth curves are defined following the general
case (5.10), by considering their image in local charts as in (5.11),

$$ \gamma : I \subset \mathbb{R} \to N : \tau \in I \mapsto \gamma(\tau) \in N . \tag{5.14} $$

Let $(\psi, V)$ be a chart on $N$. For the portion of the curve contained in $V$, the
composition $\psi \circ \gamma$ is a smooth curve in $\mathbb{R}^n$ (take (5.11) with $\varphi = \mathrm{id}$). As $N$ is equipped
with a complete atlas, we can follow the curve everywhere on $N$, by following it
from one chart to the next.

We wish to add two remarks concerning curves and functions that are important
for the sequel. For the sake of simplicity we return to the simpler case (5.13a)
of curves on Euclidean space $\mathbb{R}^n$.

(i) Smooth curves are often obtained as solutions of first-order differential equations.
Let $\tau_0$ be contained in the interval $I$, and let $p_0 = \gamma(\tau_0) \in \mathbb{R}^n$ be the point
on the curve reached at “time” $\tau_0$. If we take the derivative

$$ \dot\gamma(\tau) = \frac{d}{d\tau}\gamma(\tau) , $$

then

$$ \dot\gamma(\tau_0) = \sum_{i=1}^{n} \dot\gamma^i(\tau_0)\, e_i \overset{\mathrm{def}}{=} v_{p_0} $$

is the vector tangent to the curve in $p_0$. Now, suppose we draw all tangent vectors
$v_p$ in all points $p \in \gamma(\tau)$ of the curve. Clearly, this reminds us of the stepwise
construction of solutions of mechanical equations of motion. However, we need more
than that: the tangent vectors must be known in all points of an open domain in $\mathbb{R}^n$,
not just along one curve $\gamma(\tau)$. Furthermore, the field of vectors obtained in this
way must be smooth everywhere where it is defined, not just along the curve. $\gamma(\tau)$
is then one representative of a set of solutions of the first-order differential equation

$$ \dot\alpha(\tau) = v_{\alpha(\tau)} . \tag{5.15} $$



As an example, consider a mechanical system with one degree of freedom: the
one-dimensional harmonic oscillator. From Sect. 1.17.1, let

$$ H = \tfrac{1}{2}\,(p^2 + q^2) . $$

The equation of motion reads $\dot x = \mathbf{J} H_{,x} \equiv X_H$, with

$$ X_H = \begin{pmatrix} \partial H/\partial p \\ -\partial H/\partial q \end{pmatrix} = \begin{pmatrix} p \\ -q \end{pmatrix} . $$

$x$ is a point in the two-dimensional manifold $N = \mathbb{R}^2$; the vector field $X_H$ is
said to be the Hamiltonian vector field. The solutions of the differential equation
$\dot x = X_H$ (5.15),

$$ x(\tau) = \Phi_{\tau - \tau_0}(x_0) = \begin{pmatrix} q_0\cos(\tau - \tau_0) + p_0\sin(\tau - \tau_0) \\ -q_0\sin(\tau - \tau_0) + p_0\cos(\tau - \tau_0) \end{pmatrix} , $$

are curves on $N$, each of which is fixed by the initial condition

$$ x(\tau_0) = x_0 = \begin{pmatrix} q_0 \\ p_0 \end{pmatrix} . $$
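
That the closed-form flow is an integral curve of $X_H$ can be checked by integrating
(5.15) numerically. The following is a minimal sketch; the choice of a classical
Runge-Kutta integrator, the step count, and all function names are assumptions made
for illustration.

```python
import math

def hamiltonian(q, p):
    """H = (p^2 + q^2) / 2 for the harmonic oscillator."""
    return 0.5 * (p * p + q * q)

def X_H(q, p):
    """Hamiltonian vector field (dH/dp, -dH/dq) = (p, -q)."""
    return (p, -q)

def flow(tau, q0, p0, tau0=0.0):
    """Closed-form solution x(tau) = Phi_{tau - tau0}(x0)."""
    c, s = math.cos(tau - tau0), math.sin(tau - tau0)
    return (q0 * c + p0 * s, -q0 * s + p0 * c)

def rk4(q, p, tau_end, steps=2000):
    """Integrate dx/dtau = X_H(x) from 0 to tau_end with classical RK4."""
    h = tau_end / steps
    for _ in range(steps):
        k1 = X_H(q, p)
        k2 = X_H(q + 0.5 * h * k1[0], p + 0.5 * h * k1[1])
        k3 = X_H(q + 0.5 * h * k2[0], p + 0.5 * h * k2[1])
        k4 = X_H(q + h * k3[0], p + h * k3[1])
        q += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        p += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return (q, p)

q0, p0 = 1.0, 0.5
exact = flow(3.0, q0, p0)      # point reached along the closed-form curve
numeric = rk4(q0, p0, 3.0)     # same point obtained by following the vector field
```

The two results agree to high accuracy, and $H$ takes the same value at `exact` as at
the initial point, as expected for an autonomous Hamiltonian system.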




(ii) Let $y$ be an arbitrary, but fixed, point of $\mathbb{R}^n$. We consider the set $T_y\mathbb{R}^n$ of
all tangent vectors at the point (i.e. vectors that are tangent to all possible curves
going through $y$), as shown in Fig. 5.7. As one can add these vectors and can multiply
them with real numbers, they form a real vector space. (In fact, one can show
that this vector space $T_y\mathbb{R}^n$ is isomorphic to $\mathbb{R}^n$, the manifold that we consider.
Therefore, in this case, we are justified in drawing the vectors $v$ in the same space
as the curves themselves, see Fig. 5.7.)

Consider a smooth function $f(x)$ on $\mathbb{R}^n$ (or on some neighborhood of the
point $y$) and a vector $v = \sum_i v^i e_i$ of $T_y\mathbb{R}^n$, and take the derivative of $f(x)$ at the
point $y$, in the direction of the tangent vector $v$. This is given by

$$ v(f) = \sum_{i=1}^{n} v^i \left.\frac{\partial f}{\partial x^i}\right|_{x=y} . \tag{5.16} $$

Fig. 5.7. The vectors tangent to all possible smooth curves
through a given point $y$ of $\mathbb{R}^n$ span a vector space, the tangent
space $T_y\mathbb{R}^n$



This directional derivative assigns to each function $f(x) \in \mathfrak{F}(\mathbb{R}^n)$ a real number
given by (5.16),

$$ v : \mathfrak{F}(\mathbb{R}^n) \to \mathbb{R} : f \mapsto v(f) . $$

This derivative has the following properties: if $f(x)$ and $g(x)$ are two smooth
functions on $\mathbb{R}^n$, and $a$ and $b$ two real numbers, then

V1. $v(af + bg) = a\,v(f) + b\,v(g)$ ($\mathbb{R}$-linearity) , (5.17)

V2. $v(fg) = v(f)\,g(y) + f(y)\,v(g)$ (Leibniz’ rule) . (5.18)
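
Properties V1 and V2 can be made concrete with a small numerical experiment. The
sketch below is not part of the text: it realizes the directional derivative (5.16)
with dual numbers $a + b\,\varepsilon$, $\varepsilon^2 = 0$, a standard device for exact first
derivatives of polynomial expressions. The class `Dual`, the helper `tangent_vector`,
and the sample functions are illustrative assumptions.

```python
class Dual:
    """Number val + eps * e with e**2 = 0; eps carries a directional derivative."""

    def __init__(self, val, eps=0.0):
        self.val, self.eps = val, eps

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.eps + o.eps)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.eps - o.eps)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        # (a + b e)(c + d e) = ac + (ad + bc) e, since e**2 = 0
        return Dual(self.val * o.val, self.val * o.eps + self.eps * o.val)

    __rmul__ = __mul__


def tangent_vector(y, v):
    """Return the map f -> sum_i v^i (df/dx^i)(y), cf. (5.16)."""
    def apply(f):
        return f(*[Dual(yi, vi) for yi, vi in zip(y, v)]).eps
    return apply


def f(x1, x2):
    return x1 * x1 * x2 + 3 * x2

def g(x1, x2):
    return x1 - 2 * x2 * x2

y = (1.0, 2.0)                      # base point
v = tangent_vector(y, (0.5, -1.5))  # tangent vector at y

a, b = 2.0, -3.0
linear_lhs = v(lambda x1, x2: a * f(x1, x2) + b * g(x1, x2))
linear_rhs = a * v(f) + b * v(g)                 # V1
leibniz_lhs = v(lambda x1, x2: f(x1, x2) * g(x1, x2))
leibniz_rhs = v(f) * g(*y) + f(*y) * v(g)        # V2
```

For these sample functions one finds $v(f) = -4$ and $v(g) = 12.5$, and both sides of
V2 evaluate to 128.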



5.3.2 Tangent Vectors on a Smooth Manifold 

Thinking of a smooth, two-dimensional manifold $M$ as a surface embedded in $\mathbb{R}^3$,
we can see that the tangent vectors at the point $y$ of $M$ are contained in the plane
through $y$ tangent to $M$. This tangent space $T_yM$ is the Euclidean space $\mathbb{R}^2$. This is
true more generally. Let $M$ be an $n$-dimensional hypersurface embedded in $\mathbb{R}^{n+1}$.
$T_yM$ is a vector space of dimension $n$, isomorphic to $\mathbb{R}^n$. Any element of $T_yM$
can be used to form a directional derivative of functions on $M$. These derivatives
have properties V1 and V2.

In the case of an arbitrary, abstractly defined, smooth manifold, it is precisely
these properties which are used in the definition of tangent vectors: a tangent vector
$v$ in the point $p \in M$ is a real-valued function

$$ v : \mathfrak{F}(M) \to \mathbb{R} \tag{5.19} $$

that has properties V1 and V2, i.e.

$$ v(af + bg) = a\,v(f) + b\,v(g) , \tag{V1} $$

$$ v(fg) = v(f)\,g(p) + f(p)\,v(g) , \tag{V2} $$

where $f, g \in \mathfrak{F}(M)$ and $a, b \in \mathbb{R}$. The second property, in particular, shows that
$v$ acts like a derivative. This is what we expect from the concrete example of Euclidean
space $\mathbb{R}^n$. The space $T_pM$ of all tangent vectors in $p \in M$ is a vector
space over $\mathbb{R}$, provided addition of vectors and multiplication with real numbers
are defined as usual, viz.

$$ (v_1 + v_2)(f) = v_1(f) + v_2(f) , \qquad (av)(f) = a\,v(f) , \tag{5.20} $$

for all functions $f$ on $M$ and all real numbers $a$. This vector space has the same
dimension as $M$.

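In general, one cannot take a partial derivative of a function $g \in \mathfrak{F}(M)$ on $M$
itself. However, this is possible for the image of $g$ in local charts. Let $(\varphi, U)$ be
a chart, $p \in U$ a point on $M$, and $g$ a smooth function on $M$. The derivative of
$g \circ \varphi^{-1}$ with respect to the natural coordinate function $f^i$ (5.7), which is taken at
the image $\varphi(p)$ in $\mathbb{R}^n$, is well defined. It is

$$ \partial_i\big|_p (g) \equiv \left.\frac{\partial g}{\partial x^i}\right|_p \overset{\mathrm{def}}{=} \frac{\partial (g \circ \varphi^{-1})}{\partial f^i}\bigl(\varphi(p)\bigr) . \tag{5.21} $$

The functions

$$ \partial_i\big|_p \equiv \left.\frac{\partial}{\partial x^i}\right|_p : \mathfrak{F}(M) \to \mathbb{R} : g \mapsto \left.\frac{\partial g}{\partial x^i}\right|_p , \qquad i = 1, 2, \ldots, n , \tag{5.22} $$

have properties V1 and V2 and hence are tangent vectors to $M$, at the point
$p \in U \subset M$.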

The objects defined in (5.22) are useful in two respects. Firstly, they are used
to define partial derivatives of smooth functions $g$ on $M$, by projecting $g$ onto a
Euclidean space by means of local charts. Secondly, one can show that the vectors

$$ \partial_1|_p ,\ \partial_2|_p ,\ \ldots ,\ \partial_n|_p $$

form a basis of the tangent space $T_pM$ (see e.g. O’Neill 1983), so that any vector
of $T_pM$ has the representation

$$ v = \sum_{i=1}^{n} v(x^i)\, \partial_i\big|_p \tag{5.23} $$

in local charts, $x^i$ being the coordinates defined in (5.9).
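
The representation (5.23) can be tested numerically on a simple manifold. The
following sketch is an illustration under stated assumptions, not the author's
construction: $M$ is the unit sphere, the chart $\varphi$ projects the upper hemisphere
onto its first two Cartesian coordinates, and derivatives are approximated by
central differences. The curve, the function `g`, and all names are hypothetical.

```python
import math

def phi(p):
    """Chart on the upper hemisphere: keep the first two coordinates."""
    X, Y, Z = p
    return (X, Y)

def phi_inv(x):
    """Inverse chart: lift (x1, x2) back to the upper hemisphere."""
    x1, x2 = x
    return (x1, x2, math.sqrt(1.0 - x1 * x1 - x2 * x2))

def g(p):
    """Sample smooth function on the sphere."""
    X, Y, Z = p
    return X * Z + Y * Y

def gamma(tau):
    """Sample smooth curve on the sphere through p = gamma(0)."""
    theta, az = 0.7 + 0.3 * tau, 0.4 + tau
    return (math.sin(theta) * math.cos(az),
            math.sin(theta) * math.sin(az),
            math.cos(theta))

def v(h, eps=1e-5):
    """Tangent vector at p as a derivation: h -> d/dtau h(gamma(tau)) at tau = 0."""
    return (h(gamma(eps)) - h(gamma(-eps))) / (2.0 * eps)

def partial(i, h, p, eps=1e-5):
    """Base vector d_i|_p applied to h, i.e. d(h o phi^{-1})/df^i at phi(p), cf. (5.21)."""
    plus, minus = list(phi(p)), list(phi(p))
    plus[i] += eps
    minus[i] -= eps
    return (h(phi_inv(plus)) - h(phi_inv(minus))) / (2.0 * eps)

p = gamma(0.0)
components = [v(lambda pt, i=i: phi(pt)[i]) for i in range(2)]   # v(x^i)
lhs = v(g)                                                       # v applied to g
rhs = sum(components[i] * partial(i, g, p) for i in range(2))    # right side of (5.23)
```

Up to discretization error, `lhs` and `rhs` coincide: the derivation defined by the
curve is the linear combination of the base vectors with coefficients $v(x^i)$.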

We now summarize our findings. A vector space $T_pM$ is pinned to each point
$p$ of a smooth, but otherwise arbitrary, manifold $M$. It has the same dimension
as $M$ and its elements are the tangent vectors to $M$ at the point $p$. If $(\varphi, U)$ is a
chart on $M$ that contains $p$, the vectors $\partial_i|_p$, $i = 1, \ldots, n$, defined in (5.22), form
a basis of $T_pM$, i.e. they are linearly independent and any vector $v$ of $T_pM$ can
be represented as a linear combination of them.

5.3.3 The Tangent Bundle of a Manifold

All points $p, q, r, \ldots$ of a smooth manifold $M$ possess their own tangent spaces
$T_pM, T_qM, T_rM, \ldots$ Although these spaces all have the same dimension they are
different from each other. For this reason one usually draws them, symbolically,
as shown in Fig. 5.8, in such a way that they do not intersect. (Had they been drawn like
tangents to $M$, they would seem to intersect.)



Fig. 5.8. The set of all tangent
spaces at the points $p, q, \ldots$ of the
manifold $M$ is the tangent bundle
$TM$ of $M$

One can show, without too much difficulty, that the (disjoint) union of all tangent
spaces

$$ TM \overset{\mathrm{def}}{=} \bigcup_{p \in M} T_pM \tag{5.24} $$

is again a smooth manifold. This manifold $TM$ is said to be the tangent bundle,
$M$ being the base space and the tangent spaces $T_pM$ being the fibres. If $M$ has
dimension $\dim M = n$, the tangent bundle has dimension

$$ \dim TM = 2n . $$

Figure 5.8 exhibits symbolically this fibre structure of $TM$. Very much like the
base space itself, the manifold $TM$ is described by means of local charts and by means
of a complete atlas of charts. In fact, the differentiable structure on $M$ induces
in a natural way a differentiable structure on $TM$. Without going into the precise
definitions at this point, qualitatively we may say this: each chart $(\varphi, U)$ is a
differentiable mapping from a neighborhood $U$ of $M$ into $\mathbb{R}^n$. Consider then
$TU \overset{\mathrm{def}}{=} \bigcup_{p \in U} T_pM$, i.e. the open subset $TU$ of $TM$, which is defined once $U$ is
given. The mapping $\varphi$ from $U$ to $\mathbb{R}^n$ induces a mapping of the tangent vectors in
$p$ onto tangent vectors in $\varphi(p)$, the image of $p$:

$$ T\varphi : TU \to \varphi(U) \times \mathbb{R}^n . $$

This mapping is linear and it has all the properties of a chart (we do not show
this here, but refer to Sect. 5.4.1 below, which gives the definition of the tangent
mapping). As a result, each chart $(\varphi, U)$ from the atlas for $M$ induces a chart
$(T\varphi, TU)$ for $TM$. This chart is said to be the bundle chart associated with $(\varphi, U)$.
A point of $TM$ is characterized by two entries

$$ (p, v) \quad \text{with} \quad p \in M \ \text{and} \ v \in T_pM , $$

i.e. by the base point $p$ of the fibre $T_pM$ and by the vector $v$, an element of this vector
space. Furthermore, there is a natural projection from $TM$ to the base space $M$,

$$ \pi : TM \to M : (p, v) \mapsto p , \qquad p \in M ,\ v \in T_pM . \tag{5.25} $$

To each element in the fibre $T_pM$ it assigns its base point $p$.

Lagrangian mechanics provides a particularly beautiful example for the concept
of the tangent bundle. Let $Q$ be the manifold of physical motions of a mechanical
system and let $u$ be a point of $Q$, which is represented by coordinates $\{q\}$ in
local charts. Consider all possible smooth curves $\gamma(\tau)$ going through this point,
with the orbit parameter always being chosen such that $u = \gamma(0)$. The tangent
vectors $v_u = \dot\gamma(0)$, which appear as $\{\dot q\}$ in charts, span the vector space $T_uQ$.
The Lagrangian function of an autonomous system is defined locally as a function
$L(q, \dot q)$, where $q$ is an arbitrary point in the physical manifold $Q$, while $\dot q$ is the
set of all tangent vectors at that point, both being written in local charts of $TQ$.
It is then clear that the Lagrangian function is a function on the tangent bundle,
as anticipated in Fig. 5.2,

$$ L : TQ \to \mathbb{R} . $$

It is defined in points $(p, v)$ of the tangent bundle $TQ$, i.e. locally it is a function of
the generalized coordinates $q$ and the velocities $\dot q$. It is the postulate of Hamilton’s
principle that determines the physical orbits $q(t) = \Phi(t)$ via differential equations
obtained by means of the Lagrangian function. We return to this in Sect. 5.5.

As a final remark in this subsection, we point out that $TM$ locally has the
product structure $M \times \mathbb{R}^n$. However, its global structure can be more complicated.

5.3.4 Vector Fields on Smooth Manifolds 

Vector fields of the kind sketched in Fig. 5.9 are met everywhere in physics. For a
physicist they are examples of an intuitively familiar concept: flow fields of a liquid,
velocity fields of swarms of particles, force fields, or more specifically within
canonical mechanics, Hamiltonian vector fields. In the preceding two sections we
considered all possible tangent vectors $v_p \in T_pM$ in a point $p$ of $M$. The concept
of a vector field concerns something else: it is a prescription that assigns to each
point $p$ of $M$ precisely one tangent vector $V_p$ taken from $T_pM$.¹ For example,
given the stationary flow of a liquid in a vessel, the flow velocity at each point
inside the vessel is uniquely determined. At the same time it is an element of the
tangent space that belongs to this point. In other words, at every point the flow
field chooses a specific vector from the vector space pertaining to that point.

¹ In what follows we shall often call $V_p$, i.e. the restriction of the vector field to $T_pM$, a
representative of the vector field.

Fig. 5.9. Sketch of a smooth vector field on the
manifold $M$

These general considerations are cast into a precise form by the following definition.

VF1. A vector field $V$ on the smooth manifold $M$ is a function that assigns
to every point $p$ of $M$ a specific tangent vector $V_p$ taken from the vector
space $T_pM$:

$$ V : M \to TM : p \in M \mapsto V_p \in T_pM . \tag{5.26} $$

According to (5.19) tangent vectors are applied to smooth functions on $M$
and yield their generalized directional derivatives. In a similar fashion, vector
fields act on smooth functions,

$$ V : \mathfrak{F}(M) \to \mathfrak{F}(M) , $$

by the following rule: at every point $p \in M$ the representative $V_p$ of the
vector field $V$ is applied to the function, viz.

$$ (Vf)(p) = V_p(f) , \qquad f \in \mathfrak{F}(M) . \tag{5.27} $$

This rule allows us to define smoothness of vector fields, as follows.

VF2. The vector field $V$ is said to be smooth if $Vf$ is smooth, for all smooth
functions $f$ on $M$.



The vector field $V$ leads from $M$ to $TM$ by assigning to each $p \in M$ the
element $(p, V_p)$ of $TM$. Applying the projection $\pi$, as defined in (5.25), to this
element yields the identity on $M$. Any such mapping

$$ \sigma : M \to TM $$

that has the property $\pi \circ \sigma = \mathrm{id}_M$ is said to be a section in $TM$. Hence, smooth
vector fields are differentiable sections.

In a chart $(\varphi, U)$, i.e. in local coordinates, a vector field can be represented
locally by means of coordinate vector fields, or base fields. For every point $p$ of
an open neighborhood $U \subset M$, the base field $\partial_i|_p$, according to (5.22), is defined
as a vector field on $U$:

$$ \partial_i : U \to TU : p \in U \mapsto \partial_i\big|_p . \tag{5.28} $$

As the functions $(g \circ \varphi^{-1})$, which appear in (5.21), are differentiable, it is clear that
$\partial_i$ is a smooth vector field on $U$. As in Sect. 5.2.2 let us denote the chart mapping
by $\varphi(p) = (x^1(p), \ldots, x^n(p))$. Any smooth vector field $V$ defined on $U \subset M$
has the local representation

$$ V = \sum_{i=1}^{n} (V x^i)\, \partial_i \tag{5.29} $$

on $U$. Finally, by joining together these local representations on the domains of
charts $U, V, \ldots$ of a complete atlas, we obtain a patchwise representation of the
vector field that extends over the manifold as a whole. The base fields on two
contiguous, overlapping domains $U$ and $V$ of the charts $(\varphi, U)$ and $(\psi, V)$,
respectively, are related as follows. Returning to (5.21) and making use of the chain
rule, one has



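$$ \frac{\partial (g \circ \varphi^{-1})}{\partial f^i} = \sum_{k=1}^{n} \frac{\partial (g \circ \psi^{-1})}{\partial f^k}\, \frac{\partial (\psi^k \circ \varphi^{-1})}{\partial f^i} . $$

Denoting the derivatives (5.21) by $\partial_i^{\varphi}|_p$ and $\partial_i^{\psi}|_p$, i.e. by indicating the chart
mapping $\varphi$ or $\psi$ as a superscript, we find in the overlap of $U$ and $V$

$$ \partial_i^{\varphi}\big|_p = \sum_{k=1}^{n} \frac{\partial (\psi^k \circ \varphi^{-1})}{\partial f^i}\, \partial_k^{\psi}\big|_p . \tag{5.30} $$

The matrix appearing on the right-hand side is the Jacobi matrix $J_{\psi \circ \varphi^{-1}}$ of the
transition mapping $(\psi \circ \varphi^{-1})$.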
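
As a worked illustration of (5.30), take for $M$ the plane without the origin, with
$(\varphi, U)$ the polar chart $(r, \theta)$ and $(\psi, V)$ the Cartesian chart $(x, y)$. This choice
of charts is an assumption made here for illustration only. The transition mapping is
$\psi \circ \varphi^{-1}(r, \theta) = (r\cos\theta, r\sin\theta)$, with Jacobi matrix

$$ J_{\psi \circ \varphi^{-1}} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} , $$

so that (5.30) yields

$$ \partial_r = \cos\theta\, \partial_x + \sin\theta\, \partial_y , \qquad \partial_\theta = -r\sin\theta\, \partial_x + r\cos\theta\, \partial_y . $$

The coefficients of each polar base field are read off from the corresponding column
of the Jacobi matrix.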

The set of all smooth vector fields on $M$ is usually denoted by $\mathfrak{X}(M)$ or $\mathcal{V}(M)$.
We already know an example from Sect. 5.3.1: the Hamiltonian vector field on a
two-dimensional phase space. If $x^1 = q$ and $x^2 = p$ denote local coordinates, then

$$ X_H = \frac{\partial H}{\partial p}\, \partial_1 - \frac{\partial H}{\partial q}\, \partial_2 , $$

so that $v^i = X_H(x^i)$ gives the vector field of that example.

According to (5.19) a tangent vector $v$ of $T_pM$ assigns to each smooth function
$f$ a real number. In the case of a vector field this statement applies to every point
$p$ of $M$, cf. (5.27). When we consider this equation as a function of $p$, we see
that the action of the field $V$ on the function $f$ yields another smooth function on
$M$,

$$ V \in \mathcal{V}(M) : f \in \mathfrak{F}(M) \mapsto Vf \in \mathfrak{F}(M) : f(p) \mapsto V_p(f) . \tag{5.31} $$

This action of $V$ on functions in $\mathfrak{F}(M)$ has the properties V1 and V2 of Sect. 5.3.2,
i.e. $V$ acts on $f$ like a derivative. Therefore, vector fields can equivalently be understood
as derivatives on the set $\mathfrak{F}(M)$ of smooth functions on $M$.²

Starting from this interpretation of vector fields one can define the commutator
of two vector fields $X$ and $Y$ of $\mathcal{V}(M)$,

$$ Z = [X, Y] \overset{\mathrm{def}}{=} XY - YX . \tag{5.32} $$

$X$ or $Y$, when applied to smooth functions, yield again smooth functions. Therefore,
as $X(Yf)$ and $Y(Xf)$ are functions, the action of the commutator on $f$ is given
by

$$ Zf = X(Yf) - Y(Xf) . $$

One verifies by explicit calculation that $Z$ fulfills V1 and V2, and, in particular,
that $Z(fg) = (Zf)g + f(Zg)$. In doing this calculation one notices that it is
important to take the commutator in (5.32). Indeed, the mixed terms $(Xf)(Yg)$
and $(Yf)(Xg)$ only cancel by taking the difference $(XY - YX)$. As a result, the
commutator is again a derivative for smooth functions on $M$, or, equivalently, the
commutator is a smooth vector field on $M$. (This is not true for the products $XY$
and $YX$.) For each point $p \in M$ (5.32) defines a tangent vector $Z_p$ in $T_pM$ given
by $Z_p(f) = X_p(Yf) - Y_p(Xf)$.
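
The cancellation of the mixed terms can be verified with exact arithmetic. The sketch
below is an illustration, not taken from the text: it works on $M = \mathbb{R}^2$ with polynomial
functions stored as dictionaries mapping exponent pairs $(i, j)$ to the coefficient of
$x^i y^j$, and with vector fields stored as pairs of such polynomials. The
representation and all names are assumptions.

```python
def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {k: c for k, c in p.items() if c != 0}

def add(p, q, sign=1):
    """Return p + sign * q."""
    r = dict(p)
    for k, c in q.items():
        r[k] = r.get(k, 0) + sign * c
    return clean(r)

def mul(p, q):
    """Product of two polynomials."""
    r = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            r[(i + k, j + l)] = r.get((i + k, j + l), 0) + a * b
    return clean(r)

def diff(p, var):
    """Partial derivative with respect to x (var = 0) or y (var = 1)."""
    r = {}
    for (i, j), c in p.items():
        e = (i, j)[var]
        if e:
            key = (i - 1, j) if var == 0 else (i, j - 1)
            r[key] = r.get(key, 0) + c * e
    return clean(r)

def apply(X, f):
    """Action of the vector field X = a d_x + b d_y on the function f."""
    return add(mul(X[0], diff(f, 0)), mul(X[1], diff(f, 1)))

def bracket(X, Y):
    """Components of Z = [X, Y]: Z^i = X(Y^i) - Y(X^i)."""
    return tuple(add(apply(X, Y[i]), apply(Y, X[i]), -1) for i in range(2))

X = ({(0, 1): -1}, {(1, 0): 1})      # X = -y d_x + x d_y
Y = ({(0, 0): 1}, {})                # Y = d_x
f = {(2, 1): 1, (0, 3): 1}           # f = x^2 y + y^3
g = {(1, 1): 1, (0, 0): 1}           # g = x y + 1

Z = bracket(X, Y)
Zf_direct = add(apply(X, apply(Y, f)), apply(Y, apply(X, f)), -1)   # X(Yf) - Y(Xf)

# Leibniz rule for Z, and the defect of the product XY
fg = mul(f, g)
leibniz_Z = add(mul(apply(Z, f), g), mul(f, apply(Z, g)))
XY = lambda h: apply(X, apply(Y, h))
defect_XY = add(XY(fg), add(mul(XY(f), g), mul(f, XY(g))), -1)
mixed = add(mul(apply(X, f), apply(Y, g)), mul(apply(Y, f), apply(X, g)))
```

For this choice one finds $[X, Y] = -\partial_y$. The quantity `defect_XY` equals the mixed
terms $(Xf)(Yg) + (Yf)(Xg)$ and does not vanish, which shows that the product $XY$ alone
does not satisfy V2, while `apply(Z, fg)` coincides with `leibniz_Z`.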

The commutator of the base fields in the domain of a given chart vanishes,
$[\partial_i, \partial_k] = 0$. This is an expression of the well-known fact that the mixed, second
partial derivatives of smooth functions are symmetric. Without going into the details,
we close this subsection with the remark that $[X, Y]$ can also be interpreted
as the so-called Lie derivative of the vector field $Y$ by the vector field $X$. What this
means can be described in a qualitative manner as follows: a vector field $X$ defines
a flow, through the collection of solutions of the differential equation $\dot\alpha(\tau) = X_{\alpha(\tau)}$,
as in (5.15). One can ask the question, given certain differential-geometric objects
such as functions, vector fields, etc., how these objects change along the flow of
$X$. In other words, one takes their derivative along the flow of a given vector field
$X$. This special type of derivative is said to be the Lie derivative; in the general
case it is denoted by $L_X$. If acting on vector fields, it is $L_X Y = [X, Y]$. It has the
following property: $L_{[X,Y]} = [L_X, L_Y]$, to which we shall return in Sect. 5.5.5.

² The precise statement is this: the real vector space of $\mathbb{R}$-linear derivations on $\mathfrak{F}(M)$ is isomorphic
to the real vector space $\mathcal{V}(M)$.

5.3.5 Exterior Forms



Let $\gamma$ be a smooth curve on the manifold $M$,

$$ \gamma = \{\gamma^1, \ldots, \gamma^n\} : I \subset \mathbb{R} \to M , $$

that goes through the point $p \in M$ such that $p = \gamma(\tau = 0)$. Let $f$ be a smooth
function on $M$. The directional derivative of this function in $p$, along the tangent
vector $v_p = \dot\gamma(0)$, is given by

$$ df_p(v_p) = \left.\frac{d}{d\tau} f(\gamma(\tau))\right|_{\tau=0} . \tag{5.33} $$



This provides an example for a differentiable mapping of the tangent space $T_pM$
onto the real numbers. Indeed,

$$ df_p : T_pM \to \mathbb{R} $$

assigns to every $v_p$ the real number $df(\gamma(\tau))/d\tau\,|_{\tau=0}$. This mapping is linear. As
is well known from linear algebra, the linear mappings from $T_pM$ to $\mathbb{R}$ span the
vector space dual to $T_pM$. This vector space is denoted by $T_p^*M$ and is said to
be the cotangent space (cotangent to $M$) at the point $p$. The disjoint union of the
cotangent spaces over all points $p$ of $M$,

$$ T^*M \overset{\mathrm{def}}{=} \bigcup_{p \in M} T_p^*M , \tag{5.34} $$

finally, is called the cotangent bundle, in analogy to the tangent bundle (5.24). Let
us denote the elements of $T_p^*M$ by $\omega_p$. Of course, in the example (5.33) we may
take the point $p$ to be running along a curve $\gamma(\tau)$, or, more generally, if there is
a set of curves that cover the whole manifold, we may take it to be wandering
everywhere on $M$. This generates something like a “field” of directional derivatives
everywhere on $M$ that is linear and differentiable. Such a geometric object, which
is, in a way, dual to the vector fields defined previously, is said to be a differential
form of degree 1, or simply a one-form. Its precise definition goes as follows.



DF1. A one-form is a function

$$ \omega : M \to T^*M : p \mapsto \omega_p \in T_p^*M \tag{5.35} $$

that assigns to every point $p \in M$ an element $\omega_p$ in the cotangent space
$T_p^*M$. Here, the form $\omega_p$ is a linear mapping of the tangent space $T_pM$
onto the reals, i.e. $\omega_p(v_p)$ is a real number.

Since $\omega$ acts on tangent vectors $v_p$ at every point $p$, we can apply this one-form
to smooth vector fields $X$: the result $\omega(X)$ is then a real function whose
value in $p$ is given by $\omega_p(X_p)$. Therefore, definition DF1 can be supplemented
by a criterion that tests whether the function obtained in this way is differentiable,
viz. the following.

DF2. The one-form $\omega$ is said to be smooth if the function $\omega(X)$ is smooth
for any vector field $X \in \mathcal{V}(M)$.

The set of all smooth one-forms over $M$ is often denoted by $\mathfrak{X}^*(M)$, the notation
stressing the fact that it consists of objects that are dual to the vector fields,
denoted by $\mathfrak{X}(M)$ or $\mathcal{V}(M)$.

An example of a smooth differential form of degree 1 is provided by the differential
of a smooth function on $M$,

$$ df : TM \to \mathbb{R} , \tag{5.36} $$

which is defined such that $(df)(X) = X(f)$. For instance, consider the chart mapping
(5.9),

$$ \varphi(p) = (x^1(p), \ldots, x^n(p)) , $$

where the $x^i(p)$ are smooth functions on $M$. The differential (5.36) of $x^i$ in the
neighborhood $U \subset M$, for which the chart is valid, is

$$ dx^i : TU \to \mathbb{R} . $$

Let $v = (v(x^1), \ldots, v(x^n))$ be a tangent vector taken from the tangent space $T_pM$
at a point $p$ of $M$. Applying the one-form $dx^i$ to $v$ yields a real number that is
just the component $v(x^i)$ of the tangent vector,

$$ dx^i(v) = v(x^i) . $$

This is easily understood if one recalls the representation (5.23) of $v$ in a local
chart and if one calculates the action of $dx^i$ on the base vector $\partial_j|_p$ (5.22). One
finds, indeed, that

$$ dx^i(\partial_j|_p) = \left.\frac{\partial}{\partial x^j}\right|_p x^i = \delta^i_j . $$

With this result in mind one readily understands that the one-forms $dx^i$ form a
basis of the cotangent space $T_p^*M$ at each point $p$ of $M$. The basis $\{dx^i|_p\}$ of
$T_p^*M$ is the dual of the basis $\{\partial_i|_p\}$ of $T_pM$. The one-forms $dx^1, \ldots, dx^n$ are said
to be base differential forms of degree 1 on $U$. This means, in particular, that any
smooth one-form can be written locally as

$$ \omega = \sum_{i=1}^{n} \omega(\partial_i)\, dx^i . \tag{5.37} $$

Here $\omega(\partial_i)$ at each point $p$ is the real number obtained when applying the one-form
$\omega$ onto the base field $\partial_i$; see DF1 and DF2. The representation is valid on
the domain $U$ of a given chart. As the manifold $M$ can be covered by means of
the charts of a complete atlas and as neighboring charts are joined together
diffeomorphically, one can continue the representation (5.37) patchwise on the charts
$(\varphi, U)$, $(\psi, V)$, etc. everywhere on $M$.

As an example, consider the total differential of a smooth function $g$ on $M$,
where $M$ is a smooth manifold described by a complete atlas of, say, two charts.
On the domain $U$ of the first chart $(\varphi, U)$ we have $dg(\partial_i) = \partial g/\partial x^i$, and hence
$dg = \sum_{i=1}^{n} (\partial g/\partial x^i)\, dx^i$. Similarly, on the domain $V$ of the second chart $(\psi, V)$,
$dg(\partial_i) = \partial g/\partial y^i$ and $dg = \sum_{i=1}^{n} (\partial g/\partial y^i)\, dy^i$. On the overlap of $U$ and $V$ either
of the two local representations is valid. The base fields on $U$ and those on $V$ are
related by the Jacobi matrix, cf. (5.30), while the base forms are related by the
inverse of that matrix.
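
A short worked example may help here. It assumes, for illustration only, the plane
without the origin with Cartesian coordinates $(x, y)$ on one chart and polar
coordinates $(r, \theta)$ on the other, and the sample function $g = x^2 + y^2 = r^2$. In the
two charts

$$ dg = 2x\, dx + 2y\, dy \qquad \text{and} \qquad dg = 2r\, dr . $$

With $x = r\cos\theta$ and $y = r\sin\theta$ the base forms are related by

$$ dx = \cos\theta\, dr - r\sin\theta\, d\theta , \qquad dy = \sin\theta\, dr + r\cos\theta\, d\theta , $$

and inserting these into the Cartesian expression gives

$$ 2r\cos\theta\,(\cos\theta\, dr - r\sin\theta\, d\theta) + 2r\sin\theta\,(\sin\theta\, dr + r\cos\theta\, d\theta) = 2r\, dr , $$

so that the two local representations agree on the overlap.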

Let us summarize the dual concepts of vector fields and one-forms. As indicated
in VF1 the vector field $X$ chooses one specific tangent vector $X_p$ from each
tangent space $T_pM$ at the point $p$ of $M$. This representative $X_p$ acts on smooth
functions in a differentiable manner, according to the rules V1 and V2. The base
fields $\{\partial_i\}$ are special vector fields that are defined locally, i.e. chartwise. The one-form
$\omega$, on the other hand, assigns to each point $p$ a specific element $\omega_p$ from
the cotangent space $T_p^*M$. Thus, $\omega_p$ is a linear mapping acting on elements $X_p$
of $T_pM$. As a whole, $\omega(X)$ is a smooth function of the base point $p$. The set
of differentials $dx^i$ are special cases of one-forms in the domains of local charts.
They can be continued all over the manifold $M$, by going from one chart to the
next. The set $\{dx^i|_p\}$ is a basis of the cotangent space $T_p^*M$ that is dual to the
basis $\{\partial_i|_p\}$ of the tangent space $T_pM$.



5.4 Calculus on Manifolds 

In this last of the preparatory sections we show how to generate new geometrical
objects from those studied in Sect. 5.3 and how to do calculations with them. We
introduce the exterior product of forms, which generalizes the vector product in
$\mathbb{R}^3$, as well as the exterior derivative, which provides a systematic generalization
of the notions gradient, curl, and divergence, familiar from calculus in the space
$\mathbb{R}^3$. We also briefly discuss integral curves of vector fields and, thereby, return to
some of the results of Chap. 1. In this context, the central concepts are again those
of smooth mapping of a manifold onto another (or itself) and the linear transformations
of the tangent and cotangent spaces induced by the mapping.

5.4.1 Differentiable Mappings of Manifolds 

In Sect. 5.3 we defined smooth mappings

$$ F : (M, \mathcal{A}) \to (N, \mathcal{B}) \tag{5.38} $$

from the manifold $M$ with differentiable structure $\mathcal{A}$ onto the manifold $N$ with
differentiable structure $\mathcal{B}$. Differentiability was defined by means of charts and in
Euclidean spaces, as indicated in (5.11). It is not difficult to work out the transformation
behavior of geometrical objects on $M$, under the mapping (5.38). For
functions this is easy. Let $f$ be a smooth function on the target manifold $N$,

$$ f : N \to \mathbb{R} : q \in N \mapsto f(q) \in \mathbb{R} . $$

If $q$ is the image of the point $p \in M$ by the mapping $F$, i.e. $q = F(p)$, then the
composition $(f \circ F)$ is a smooth function on the starting manifold $M$. It is said
to be the pull-back of the function $f$, i.e. the function $f$ on $N$ is “pulled back”
to the manifold $M$, where it becomes $(f \circ F)$. This pull-back by the mapping $F$
is denoted by $F^*$,

$$ F^*f = f \circ F : p \in M \mapsto f(F(p)) \in \mathbb{R} . \tag{5.39} $$

Thus, any smooth function that is given on $N$ can be carried over to $M$. The converse,
i.e. the push-forward of a function from the starting manifold $M$ to the target
manifold $N$, is possible only if $F$ is invertible and if $F^{-1}$ is smooth as well. For
example, this is the case if $F$ is a diffeomorphism.

By (5.38), vector fields on $M$ are mapped onto vector fields on $N$. This is seen
as follows. Vector fields act on functions, as described in Sect. 5.3.4. Let $X$ be a
vector field on $M$, $X_p$ its representative in $T_pM$, the tangent space in $p \in M$, and
$g$ a smooth function on the target manifold $N$. As the composition $(g \circ F)$ is a
smooth function on $M$, we can apply $X_p$ to it, $X_p(g \circ F)$. If this is understood
as an assignment

$$ (X_F)_q : g \in \mathfrak{F}(N) \mapsto X_p(g \circ F) \in \mathbb{R} , $$

then $(X_F)_q$ is seen to be a tangent vector at the point $q = F(p)$ on the target manifold.
For this to be true, conditions V1 and V2 must be fulfilled. V1 being obvious,
we only have to verify the Leibniz rule V2. For two functions $f$ and $g$ on $N$ the
following equation holds at the points $p \in M$ and $q = F(p) \in N$, respectively,

$$
\begin{aligned}
X_F(fg) &= X\bigl((f \circ F)(g \circ F)\bigr) \\
&= X(f \circ F)\, g(F(p)) + f(F(p))\, X(g \circ F) \\
&= X_F(f)\, g(q) + f(q)\, X_F(g) .
\end{aligned}
$$

This shows that $(X_F)_q$ is indeed a tangent vector belonging to $T_qN$. In this way,
the differentiable mapping $F$ induces a linear mapping of the tangent spaces, which
is said to be the differential mapping $dF$ corresponding to $F$. The mapping

$$ dF : TM \to TN : X \mapsto X_F \tag{5.40a} $$

is defined at every point $p$ and its image $q$ by³

$$ dF_p : T_pM \to T_qN : X_p \mapsto (X_F)_q , \qquad q = F(p) . \tag{5.40b} $$

³ Below we shall also use the notation $TF$, instead of $dF$, a notation which is customary in the
mathematical literature.

Its action on functions $f \in \mathfrak{F}(N)$ is

$$ dF_p(X)(f) = X(f \circ F) , \qquad X \in \mathcal{V}(M) ,\ f \in \mathfrak{F}(N) . $$

In Fig. 5.10 we illustrate the mapping $F$ and the induced mapping $dF$. As a
matter of exception, we have drawn the tangent spaces in $p$ and in the image point
$q = F(p)$ as genuine tangent planes to $M$ and $N$, respectively. We note that if $F$
is a diffeomorphism, in particular, then the corresponding differential mapping is
a linear isomorphism of the tangent spaces. (For an example see Sect. 6.2.2.)
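
In Euclidean charts the differential mapping is represented by the Jacobi matrix of
$F$, and the defining relation $dF_p(X)(f) = X(f \circ F)$ becomes the chain rule. The
following sketch checks this numerically. It assumes $M = \mathbb{R}^2$ and $N = \mathbb{R}^3$ with identity
charts; the sample map `F`, the function `f`, and the finite-difference helpers are
illustrative choices, not taken from the text.

```python
import math

def F(p):
    """Sample smooth map F : R^2 -> R^3 (an illustrative choice)."""
    x, y = p
    return (x * y, x + y, math.sin(x))

def f(q):
    """Sample smooth function on the target manifold N = R^3."""
    a, b, c = q
    return a * a + b * c

def pullback(func, mapping):
    """F^* f = f o F, a function on the starting manifold, cf. (5.39)."""
    return lambda p: func(mapping(p))

def directional(func, p, X, eps=1e-6):
    """X_p(func): central-difference derivative of func at p along X."""
    plus = [pi + eps * Xi for pi, Xi in zip(p, X)]
    minus = [pi - eps * Xi for pi, Xi in zip(p, X)]
    return (func(plus) - func(minus)) / (2.0 * eps)

def push_forward(mapping, p, X, eps=1e-6):
    """Components of dF_p(X): the Jacobi matrix of F at p applied to X."""
    plus = mapping([pi + eps * Xi for pi, Xi in zip(p, X)])
    minus = mapping([pi - eps * Xi for pi, Xi in zip(p, X)])
    return [(a - b) / (2.0 * eps) for a, b in zip(plus, minus)]

p = (0.3, -1.2)          # point of M
X = (1.0, 2.0)           # tangent vector at p
q = F(p)                 # image point in N

lhs = directional(pullback(f, F), p, X)          # X_p(f o F)
rhs = directional(f, q, push_forward(F, p, X))   # (X_F)_q applied to f
```

Within the accuracy of the finite differences, `lhs` and `rhs` agree.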



Fig. 5.10. The smooth mapping $F$ from $M$ to $N$ induces a linear mapping $dF$ (or $TF$) of the
tangent space $T_pM$ onto the tangent space $T_qN$ in the image $q = F(p)$ of $p$



Given the transformation behavior of vector fields, we can deduce the transformation
behavior of exterior differential forms as follows. Let $\omega \in \mathfrak{X}^*(N)$ be
a one-form on $N$. As we learnt earlier, it acts on vector fields defined on $N$. As
the latter are related to vector fields on $M$ via the mapping (5.40a), one can “pull
back” the form $\omega$ on $N$ to the starting manifold $M$. The pull-back of the form $\omega$
by the mapping $F$ is denoted by $F^*\omega$. It is defined by

$$(F^*\omega)(X) = \omega(dF(X)) , \quad X \in \mathcal{V}(M) . \tag{5.41}$$

Thus, on the manifold $M$ the form $F^*\omega$ acts on $X$ and yields a real function on
$M$ whose value at $p \in M$ is given by the value of the function $\omega(dF(X))$ at
$q = F(p)$.
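The definitions (5.40b) and (5.41) can be tried out numerically. The following is a minimal sketch, not part of the original text: the map `F` (polar to Cartesian coordinates on $\mathbb{R}^2$), the function `g`, the one-form `omega`, and all helper names are assumptions chosen for illustration, and derivatives are approximated by central differences.

```python
import math

def F(p):
    """Assumed example map F: M = R^2 -> N = R^2, (r, phi) -> (x, y)."""
    r, phi = p
    return (r * math.cos(phi), r * math.sin(phi))

def apply_vector(X, func, p, h=1e-6):
    """X_p(func): derivative of func at p along X (central difference)."""
    plus = [c + h * v for c, v in zip(p, X)]
    minus = [c - h * v for c, v in zip(p, X)]
    return (func(plus) - func(minus)) / (2 * h)

def pushforward(F, p, X):
    """Components of dF_p(X_p), found by applying X to the coordinate functions of F."""
    dim_n = len(F(p))
    return [apply_vector(X, lambda x, k=k: F(x)[k], p) for k in range(dim_n)]

def pullback(omega, F, p, X):
    """(F^* omega)_p(X) = omega_q(dF_p(X)), cf. (5.41); omega(q) returns components at q."""
    Y = pushforward(F, p, X)
    return sum(w * y for w, y in zip(omega(F(p)), Y))

def g(q):
    """Assumed smooth test function on N."""
    x, y = q
    return x * x * y

def omega(q):
    """Assumed one-form on N: omega = -y dx + x dy."""
    x, y = q
    return (-y, x)

p, X = (2.0, 0.7), (0.3, -1.2)
lhs = apply_vector(X, lambda x: g(F(x)), p)         # X_p(g o F)
rhs = apply_vector(pushforward(F, p, X), g, F(p))   # dF_p(X)(g)
pulled = pullback(omega, F, p, X)                   # analytically r^2 * X^phi
```

For this particular choice one finds analytically $F^*\omega = r^2\, d\varphi$, so `pulled` should be close to $r^2 X^\varphi$, and `lhs` and `rhs` should agree up to discretization error.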



5.4.2 Integral Curves of Vector Fields 

In Sects. 1.16, 1.18-20, we studied the set of solutions of systems of first-order
differential equations, for all possible initial conditions. In canonical mechanics,
these equations are the equations of Hamilton and Jacobi, in which case the right-hand
side of (1.41) contains the Hamiltonian vector field. Smooth vector fields
and integral curves of vector fields are geometrical concepts that occur in many
areas of physics. We start by defining the tangent field of a curve $\alpha$ on a manifold
$M$. The curve $\alpha$ maps an interval $I$ of the real $\tau$-axis $\mathbb{R}_\tau$ onto $M$. The tangent
vector field on $\mathbb{R}_\tau$ is simply given by the derivative $d/d\tau$. From (5.40a), the linear
mapping $d\alpha$ maps this field onto the tangent vectors to the curve $\alpha$ on $M$. This
generates the vector field

$$\dot\alpha \stackrel{\text{def}}{=} d\alpha \circ \frac{d}{d\tau} ,$$

tangent to the curve $\alpha : I \to M$. On the other hand, for an arbitrary smooth vector
field $X$ on $M$, we may consider its representatives in the tangent spaces $T_{\alpha(\tau)}M$
over the points on the curve. In other words, we consider the vector field $X_{\alpha(\tau)}$
along the curve. Suppose now that the curve $\alpha$ is such that its tangent vector field
$\dot\alpha$ coincides with $X_{\alpha(\tau)}$. If this happens, the curve $\alpha$ is said to be an integral curve
of the vector field $X$. In this case we obtain a differential equation for $\alpha(\tau)$, viz.

$$\dot\alpha = X \circ \alpha \quad \text{or} \quad \dot\alpha(\tau) = X_{\alpha(\tau)} \quad \text{for all } \tau \in I . \tag{5.42}$$



When written out in terms of local coordinates, this is a system of differential
equations of first order,

$$\frac{d}{d\tau}\,(x^i \circ \alpha) = X^i(x^1 \circ \alpha, \ldots, x^n \circ \alpha) , \tag{5.43}$$

which is of the type studied in Chap. 1; cf. (1.41). (Note, however, that the right-hand
side of (5.43) does not depend explicitly on $\tau$. This means that the flow of
this system is always stationary.) In particular, the theorem of Sect. 1.19 on the
existence and uniqueness of solutions is applicable to the system (5.43).

Let us consider an example: the Hamiltonian vector field for a system with one
degree of freedom, i.e. on a two-dimensional phase space (see also Sects. 5.3.1 and
5.3.4),

$$X_H = \frac{\partial H}{\partial p}\,\partial_q - \frac{\partial H}{\partial q}\,\partial_p .$$

The curve $\{x^1 \circ \alpha, x^2 \circ \alpha\}(\tau) = \{q(\tau), p(\tau)\}$ is an integral curve of $X_H$ if and
only if the equations

$$\frac{dq}{d\tau} = \frac{\partial H}{\partial p} , \qquad \frac{dp}{d\tau} = -\frac{\partial H}{\partial q}$$

are fulfilled. If the phase space is the Euclidean space $\mathbb{R}^2$, its representation
in terms of charts is trivial (it is the identity on $\mathbb{R}^2$) and we can simply write
$\alpha(\tau) = \{q(\tau), p(\tau)\}$.
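As an illustration of (5.42) for a Hamiltonian vector field, the sketch below integrates $\dot\alpha = X_H \circ \alpha$ numerically. It is only an example under stated assumptions: the Hamiltonian $H = (p^2 + q^2)/2$ is our choice, the helper names are hypothetical, and the classical fourth-order Runge-Kutta scheme is used as a generic integrator (it is not discussed in the text).

```python
import math

def hamiltonian_vector_field(dH_dq, dH_dp):
    """Return X_H as a function (q, p) -> (dH/dp, -dH/dq)."""
    return lambda q, p: (dH_dp(q, p), -dH_dq(q, p))

def integral_curve(X, start, tau_end, steps=1000):
    """Integrate alpha' = X(alpha) from alpha(0) = start up to tau_end with RK4."""
    q, p = start
    h = tau_end / steps
    for _ in range(steps):
        k1 = X(q, p)
        k2 = X(q + 0.5 * h * k1[0], p + 0.5 * h * k1[1])
        k3 = X(q + 0.5 * h * k2[0], p + 0.5 * h * k2[1])
        k4 = X(q + h * k3[0], p + h * k3[1])
        q += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return q, p

# Assumed example: H = (p^2 + q^2)/2, so dH/dq = q and dH/dp = p.
X_H = hamiltonian_vector_field(lambda q, p: q, lambda q, p: p)
q1, p1 = integral_curve(X_H, (1.0, 0.0), tau_end=1.0)
```

For this Hamiltonian the integral curve through $(q, p) = (1, 0)$ is known in closed form, $q(\tau) = \cos\tau$, $p(\tau) = -\sin\tau$, which gives a simple check of the numerical result.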

The theorem of Sect. 1.19 guarantees that for each $p$ on $M$ there is precisely
one integral curve $\alpha$ for which $p$ is the initial point (or initial configuration, as
we said there) $p = \alpha(0)$. Clearly, one will try to continue that curve on $M$ as far
as this is possible. By this procedure one obtains the maximal integral curve $\alpha_p$
through $p$. The theorem of Sect. 1.19 tells us that it is uniquely determined. One
says that the vector field $X$ is complete if every one of its maximal integral curves
is defined on the entire real axis $\mathbb{R}$.

For a complete vector field the set of all maximal integral curves

$$\Phi(p, \tau) = \alpha_p(\tau)$$

yields the flow of the vector field. If one keeps the time parameter $\tau$ fixed, then
$\Phi(p, \tau)$ gives the position of the orbit point in $M$ to which $p$ has moved under
the action of the flow, for every point $p$ on $M$. If in turn one keeps $p$ fixed and
varies $\tau$, the flow yields the maximal integral curve going through $p$. We return
to this in Sect. 6.2.1.

In Chap. 1 we studied examples of flows of complete vector fields. The flows 
of Hamiltonian vector fields have the specific property of preserving volume and 
orientation. As such, they can be compared to the flow of a frictionless, incom- 
pressible fluid. 
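Two properties of such flows, the composition rule $\Phi(\Phi(p, s), t) = \Phi(p, s + t)$ of a stationary flow and the conservation of phase-space area by a Hamiltonian flow, can be probed numerically. The sketch below is an illustration only: the pendulum Hamiltonian $H = p^2/2 - \cos q$, the fixed-step RK4 integrator, and the finite-difference Jacobian are all assumptions of ours, so the checks hold only up to discretization error.

```python
import math

def pendulum_field(q, p):
    """X_H for the assumed Hamiltonian H = p^2/2 - cos(q)."""
    return (p, -math.sin(q))

def flow(point, tau, h=1e-3):
    """Approximate Phi(point, tau), tau >= 0, with fixed-step RK4."""
    q, p = point
    for _ in range(round(tau / h)):
        k1 = pendulum_field(q, p)
        k2 = pendulum_field(q + 0.5 * h * k1[0], p + 0.5 * h * k1[1])
        k3 = pendulum_field(q + 0.5 * h * k2[0], p + 0.5 * h * k2[1])
        k4 = pendulum_field(q + h * k3[0], p + h * k3[1])
        q += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (q, p)

def jacobian_determinant(point, tau, eps=1e-5):
    """det of d Phi(., tau) / d(q, p) at `point`, by central differences."""
    q, p = point
    fq_plus, fq_minus = flow((q + eps, p), tau), flow((q - eps, p), tau)
    fp_plus, fp_minus = flow((q, p + eps), tau), flow((q, p - eps), tau)
    a = (fq_plus[0] - fq_minus[0]) / (2 * eps)   # dQ/dq
    b = (fp_plus[0] - fp_minus[0]) / (2 * eps)   # dQ/dp
    c = (fq_plus[1] - fq_minus[1]) / (2 * eps)   # dP/dq
    d = (fp_plus[1] - fp_minus[1]) / (2 * eps)   # dP/dp
    return a * d - b * c

start = (0.5, 0.2)
composed = flow(flow(start, 1.0), 0.5)   # Phi(Phi(p, 1.0), 0.5)
direct = flow(start, 1.5)                # Phi(p, 1.5)
area_factor = jacobian_determinant(start, 2.0)
```

Within the accuracy of the integrator, `composed` and `direct` coincide and `area_factor` is close to 1, in line with the statement that Hamiltonian flows preserve volume and orientation.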

5.4.3 Exterior Product of One-Forms 

We start with two simple examples of forms on the manifold $M = \mathbb{R}^3$. Let
$K = (K^1, K^2, K^3)$ be a force field and let $v$ be the velocity field of a given physical
motion in $\mathbb{R}^3$. The work per unit time is given by the scalar product $K \cdot v$.

This can be written as the action of the one-form $\omega_K = \sum_{i=1}^{3} K^i\, dx^i$ onto the
tangent vector $v$, viz.

$$\omega_K(v) = \sum_i K^i v^i = K \cdot v .$$

In the second example let $v$ be the velocity field of a flow in the oriented space
$\mathbb{R}^3$. We wish to study the flux across some smooth surface in $\mathbb{R}^3$. Consider two
tangent vectors $t$ and $s$ at the point $x$ of this surface. The flux (including its sign)
across the parallelogram spanned by $t$ and $s$ is given by the scalar product of $v$
with the cross product $t \times s$,

$$\Phi_v(t, s) = v^1(t^2 s^3 - t^3 s^2) + v^2(t^3 s^1 - t^1 s^3) + v^3(t^1 s^2 - t^2 s^1) .$$

This quantity can be understood as an exterior form that acts on two tangent vectors.
It has the following properties: the form $\Phi_v$ is linear in both of its arguments.
Furthermore, it is skew-symmetric because, as we interchange $t$ and $s$, the parallelogram
changes its orientation, and the flux changes sign. A form with these
properties is said to be an exterior two-form on $\mathbb{R}^3$.

Two-forms can be obtained from two exterior one-forms, for instance by defining
a product of forms that is bilinear and skew-symmetric. This product is called
the exterior product. It is defined as follows. The exterior product of two base
forms $dx^i$ and $dx^k$ is denoted by $dx^i \wedge dx^k$. It is defined by its action on two
arbitrary tangent vectors $s$ and $t$ belonging to $T_pM$,

$$(dx^i \wedge dx^k)(s, t) = s^i t^k - s^k t^i . \tag{5.44}$$



The symbol $\wedge$ denotes the “wedge” product. As any one-form can be written as
a linear combination of base one-forms, the exterior product of two one-forms $\omega$
and $\theta$, in each point $p$ of a manifold $M$, is given by

$$(\omega \wedge \theta)_p(v, w) = \omega_p(v)\,\theta_p(w) - \omega_p(w)\,\theta_p(v)
= \det \begin{pmatrix} \omega_p(v) & \omega_p(w) \\ \theta_p(v) & \theta_p(w) \end{pmatrix} . \tag{5.45}$$

Here $v$ and $w$ are elements of $T_pM$. To each point $p$ the one-forms $\omega$ and $\theta$ assign
the elements $\omega_p$ and $\theta_p$ of $T_p^*M$, respectively. The exterior product $\omega \wedge \theta$ is
defined at each point $p$, according to (5.45), and hence everywhere on $M$.

In much the same way as the coordinate one-forms $dx^i$ serve as a basis for all
one-forms, every two-form $\overset{2}{\omega}$ can be represented by a linear combination of base
two-forms $dx^i \wedge dx^k$ (with $i < k$),

$$\overset{2}{\omega} = \sum_{i<k=1}^{n} \omega_{ik}\, dx^i \wedge dx^k , \tag{5.46}$$

the restriction $i < k$ taking account of the relation $dx^k \wedge dx^i = -dx^i \wedge dx^k$. The
coefficients in (5.46) are obtained from the action of $\overset{2}{\omega}$ onto the corresponding
base vector fields,

$$\omega_{ik} = \overset{2}{\omega}(\partial_i, \partial_k) . \tag{5.47}$$



The exterior product can be extended to three-forms, four-forms, and forms of
higher degree. For example, the $k$-fold exterior product is given by

$$(\omega_1 \wedge \omega_2 \wedge \ldots \wedge \omega_k)(v^{(1)}, v^{(2)}, \ldots, v^{(k)}) = \det\big(\omega_i(v^{(j)})\big) . \tag{5.48}$$

It is linear in its $k$ arguments and it is totally antisymmetric. Any $k$-form can be
expressed as a linear combination of base $k$-forms

$$dx^{i_1} \wedge dx^{i_2} \wedge \ldots \wedge dx^{i_k} , \quad \text{with } i_1 < i_2 < \ldots < i_k . \tag{5.49}$$

There are $\binom{n}{k}$ such base forms. In particular, if $k = n$, there is precisely
one such base form. On the other hand, for $k > n$, at least two one-forms in (5.49)
must be equal. By the antisymmetry of the base forms (5.49), any form of degree
higher than $n$ vanishes. Thus, the highest degree a form on an $n$-dimensional manifold
$M$ can have is $k = n$. For $k = n$ the form (5.49) is proportional to the oriented
volume element of an $n$-dimensional vector space.
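The determinant formula (5.48) and the counting of base forms translate directly into a few lines of code. The following sketch is an illustration under our own conventions (one-forms and vectors on $\mathbb{R}^n$ are given by constant component tuples, and all function names are hypothetical); the determinant is evaluated with the Leibniz sum over permutations, which is adequate only for the small matrices used here.

```python
from itertools import combinations, permutations
from math import comb

def one_form(components):
    """One-form with constant components: v -> sum_i components[i] * v[i]."""
    return lambda v: sum(c * x for c, x in zip(components, v))

def determinant(matrix):
    """Leibniz formula: sum over permutations of signed products."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total

def wedge(*forms):
    """k-fold exterior product, cf. (5.48): det(omega_i(v_j))."""
    def k_form(*vectors):
        if len(vectors) != len(forms):
            raise ValueError("a k-form needs exactly k vectors")
        return determinant([[w(v) for v in vectors] for w in forms])
    return k_form

def base_forms(n, k):
    """Index tuples i_1 < ... < i_k labelling the base k-forms (5.49)."""
    return list(combinations(range(1, n + 1), k))

dx1, dx2, dx3 = one_form((1, 0, 0)), one_form((0, 1, 0)), one_form((0, 0, 1))
s, t = (1, 2, 3), (4, 5, 6)
value = wedge(dx1, dx2)(s, t)      # s^1 t^2 - s^2 t^1, cf. (5.44)
swapped = wedge(dx1, dx2)(t, s)    # opposite sign by antisymmetry
# Four one-forms on R^3 are linearly dependent, so their product vanishes.
too_many = wedge(dx1, dx2, dx3, one_form((1, 1, 1)))(
    (1, 2, 3), (4, 5, 6), (7, 8, 10), (2, 0, 1))
```

With integer components the arithmetic is exact, so antisymmetry and the vanishing of forms of degree higher than $n$ can be checked without rounding issues; `len(base_forms(n, k))` reproduces the binomial coefficient $\binom{n}{k}$.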

The examples show that the exterior product is a generalization of the vector
product in $\mathbb{R}^3$. In a certain sense, it is even simpler than that because multiple
products such as (5.48) or (5.49) pose no problems of where to put parentheses.
The exterior product is associative.

5.4.4 The Exterior Derivative

In the preceding paragraph it was shown that one can generate two-forms as well 
as forms of higher degree by taking exterior products of one-forms. Here we shall 
learn that there is another possibility of obtaining smooth forms of higher degree: 
by means of the exterior derivative, or Cartan derivative.

Let us first summarize, in the form of a definition, what the preceding section
taught us about smooth differential forms of degree $k$.



DF3. A $k$-form is a function

$$\overset{k}{\omega} : M \to (T^*M)^k : p \mapsto \overset{k}{\omega}_p , \tag{5.50}$$

that assigns to each point $p \in M$ an element of $(T_p^*M)^k$, the $k$-fold direct
product of the cotangent space. $\overset{k}{\omega}_p$ is a multilinear, skew-symmetric
mapping from $(T_pM)^k$ onto the real numbers, i.e. it acts on $k$ vector fields

$$\overset{k}{\omega}_p(X_1, \ldots, X_k) \in \mathbb{R} \tag{5.51}$$

and is antisymmetric in all $k$ arguments.



The real number (5.51) is a function of the base point p. Therefore, in analogy 
to DF2 of Sect. 5.3.5, one defines smoothness for exterior forms as follows. 



DF4. The $k$-form $\overset{k}{\omega}$ is said to be smooth if the function $\overset{k}{\omega}(X_1, \ldots, X_k)$
is differentiable, for all sets of smooth vector fields $X_i \in \mathcal{V}(M)$. Locally
(i.e. in charts) any such $k$-form can be written, in a unique way, as a linear
combination of the base forms (5.49),

$$\overset{k}{\omega} = \sum_{i_1 < i_2 < \ldots < i_k} \omega_{i_1 \ldots i_k}\, dx^{i_1} \wedge \ldots \wedge dx^{i_k} . \tag{5.52}$$

The coefficients are given by the action of $\overset{k}{\omega}$ onto the corresponding base
vector fields $\partial_{i_1}, \ldots, \partial_{i_k}$.



Functions on $M$ can be understood as forms of degree zero. As we showed
in Sect. 5.3.5, the well-known total derivative converts a function into a one-form.
Indeed, in a local representation we had

$$dg = \sum_{i=1}^{n} \frac{\partial g}{\partial x^i}\, dx^i , \tag{5.53}$$

where $\partial g/\partial x^i$ are the partial derivatives, i.e. the result of the action of the one-form
$dg$ onto the base fields $\partial_i$, while the $dx^i$ are the base one-forms.

The Cartan or exterior derivative generalizes this step to smooth forms of arbitrary
degree. It maps smooth $k$-forms onto $(k+1)$-forms, this mapping being linear,

$$d : \overset{k}{\omega} \to \overset{k+1}{\omega} . \tag{5.54}$$

It is defined uniquely and has the following properties:



CD1. For functions $g$ on $M$, $dg$ is the usual total derivative.

CD2. The action of $d$ on the exterior (or wedge) product of two forms of
degree $k$ and $l$ is

$$d(\overset{k}{\omega} \wedge \overset{l}{\omega}) = (d\overset{k}{\omega}) \wedge \overset{l}{\omega} + (-)^k\, \overset{k}{\omega} \wedge (d\overset{l}{\omega}) .$$

CD3. The form $\overset{k}{\omega}$ being represented locally as in (5.52), the action of the
exterior derivative on this form is

$$d\overset{k}{\omega} = \sum_{i_1 < \ldots < i_k} d\omega_{i_1 \ldots i_k}(x^1, \ldots, x^n) \wedge dx^{i_1} \wedge \ldots \wedge dx^{i_k} .$$

Here, $d\omega_{i_1 \ldots i_k}(x^1, \ldots, x^n)$ is the total differential and is expressed in terms
of base one-forms, as in (5.53).



This exterior derivative is a local and linear operator. Property CD2 can also be 
described by saying that d is an antiderivation (with respect to the exterior product 
A), in the sense that it obeys the Leibniz rule CD2 with extra signs that depend 
on the degree of the first form. A remarkable property of the exterior derivative 
is that the composition of d with itself gives zero, 



$$d \circ d = 0 . \tag{5.55}$$



We prove this assertion for the case of smooth functions $g \in \mathfrak{F}(M)$. We have
$dg = \sum_{i=1}^{n} (\partial g/\partial x^i)\, dx^i$ and, according to CD3,

$$(d \circ d)g = d(dg) = \sum_i d(\partial g/\partial x^i) \wedge dx^i
= \Big( \sum_{k<i} + \sum_{k>i} \Big) \frac{\partial^2 g}{\partial x^k \partial x^i}\, dx^k \wedge dx^i ,$$

the terms with $k = i$ being absent because $dx^i \wedge dx^i = 0$.
If we exchange $dx^k$ and $dx^i$ in the second sum in the brackets on the right-hand
side, and if we relabel the indices by exchanging $k$ and $i$, we obtain, using the
antisymmetry of the wedge product,

$$(d \circ d)g = \sum_{k<i} \left( \frac{\partial^2 g}{\partial x^k \partial x^i} - \frac{\partial^2 g}{\partial x^i \partial x^k} \right) dx^k \wedge dx^i = 0 .$$

This vanishes because the second, mixed partial derivatives of smooth functions
are equal. The fact that (5.55) holds for any $k$-form follows from this result and
from the product rule CD2.



5.4.5 Exterior Derivative and Vectors in $\mathbb{R}^3$



To illustrate the general and somewhat abstract definitions of the preceding sections,
we consider the manifold $M = \mathbb{R}^3$, i.e. the three-dimensional Euclidean
space of physics. For a smooth function $f(x)$ the exterior derivative gives

$$df = \sum_{i=1}^{3} (\partial f/\partial x^i)\, dx^i .$$

This is the well-known total differential of $f$. When applied to the base field $\partial_k$,
it gives

$$df(\partial_k) = \partial f/\partial x^k .$$

This generates the triple $\{\partial f/\partial x^1, \partial f/\partial x^2, \partial f/\partial x^3\} = \nabla f$, which represents the
gradient of $f$ in $\mathbb{R}^3$.

The exterior product of two forms $\overset{k}{\omega}$ and $\overset{l}{\omega}$ is an exterior form of degree $(k+l)$.
Functions have to be understood as zero-forms. Thus, the exterior product of two
functions $f$ and $g$ is the ordinary product. In this case, rule CD2 is nothing but
the product rule for differentiation:

$$\nabla(fg) = (\nabla f)\, g + f\, (\nabla g) .$$



Consider now the one-form

$$\overset{1}{\omega}_a = \sum_{i=1}^{3} a_i(x)\, dx^i . \tag{5.56}$$

Its exterior derivative is

$$d\overset{1}{\omega}_a = \left( -\frac{\partial a_1}{\partial x^2} + \frac{\partial a_2}{\partial x^1} \right) dx^1 \wedge dx^2
+ \left( -\frac{\partial a_1}{\partial x^3} + \frac{\partial a_3}{\partial x^1} \right) dx^1 \wedge dx^3
+ \left( -\frac{\partial a_2}{\partial x^3} + \frac{\partial a_3}{\partial x^2} \right) dx^2 \wedge dx^3 . \tag{5.57}$$

If $\{a_1(x), a_2(x), a_3(x)\}$ are understood to be the components of a vector field $a(x)$,
the coefficients of the two-form $d\overset{1}{\omega}_a$ are seen to be the coefficients of the curl of
$a(x)$. These identifications are specific for the dimension 3 of the space $M = \mathbb{R}^3$
and do not hold in general.

The three-dimensional Euclidean space admits a metric (see Sect. 5.2.1). Furthermore,
it is orientable because three linearly independent vectors define an oriented
volume of the parallelepiped they span. Therefore, if $(e_1, e_2, e_3)$ is a set of
orthonormal vectors in the tangent space $T_x\mathbb{R}^3$, we can assign to each $k$-form $\overset{k}{\omega}$
an $(n-k)$-form, i.e. a $(3-k)$-form, denoted $*\omega$, through the definition

$$(*\omega)(e_{k+1}, \ldots, e_3) = \overset{k}{\omega}(e_1, \ldots, e_k) , \quad 0 \le k \le n = 3 . \tag{5.58}$$

This assignment is said to be the Hodge star operation. In $\mathbb{R}^3$ it assigns to every
three-form a zero-form (a function), to every two-form a one-form, and vice versa.
For example, we obtain

$*dx^1 = dx^2 \wedge dx^3$ (cyclic permutations) (two-form),

$*(dx^2 \wedge dx^3) = dx^1 = *(*dx^1)$ (cyclic permutations) (one-form),

$*(dx^1 \wedge dx^2 \wedge dx^3) = 1$ (zero-form).

Assigning the one-form (5.56) to the vector field $a(x)$, its exterior derivative is
given by (5.57). Applying the star operation to this two-form yields the one-form

$$\overset{1}{\omega}_b = *\, d\overset{1}{\omega}_a = \sum_{i=1}^{3} b_i(x)\, dx^i
= \left( \frac{\partial a_2}{\partial x^1} - \frac{\partial a_1}{\partial x^2} \right) dx^3 + \text{cyclic permutations} ,$$

where we have set $b_1 = \partial a_3/\partial x^2 - \partial a_2/\partial x^3$ (and cyclic permutations). Thus, we
obtain again a form of the type (5.56) whose coefficients are the components of
curl $a(x)$. This result is due to the dimension of the space $\mathbb{R}^3$: the star operation
turns a two-form into a one-form, and vice versa. The space of one-forms has
dimension $\binom{n}{1}$, the space of two-forms has dimension $\binom{n}{2}$. For $n = 3$ we have
$\binom{3}{1} = \binom{3}{2} = 3$, i.e. these dimensions are equal and the two spaces are isomorphic.
On the basis of this observation let us work out the relation between the exterior
product of Sect. 5.4.3 and the vector product in $\mathbb{R}^3$. For two vectors $a$ and $b$ construct
the one-forms $\omega_a$ and $\omega_b$, respectively, following the pattern of (5.56). Take
their exterior product and apply the star operation to it. This gives the one-form

$$\begin{aligned}
*(\omega_a \wedge \omega_b) &= (a_1 b_2 - a_2 b_1)\, {*}(dx^1 \wedge dx^2) + (\text{cyclic permutations}) \\
&= (a_1 b_2 - a_2 b_1)\, dx^3 + (\text{cyclic permutations}) \\
&= \overset{1}{\omega}_{a \times b} .
\end{aligned} \tag{5.59}$$

This formula explains in which sense the $\wedge$-product generalizes the ordinary vector
product.
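Formula (5.59) can be checked component by component. The sketch below is illustrative only; the function names and the 0-based index convention (index 0 stands for $x^1$, and so on) are our own. It computes the coefficients of $\omega_a \wedge \omega_b$ on the base two-forms with $i < k$, applies the star operation using $*(dx^1 \wedge dx^2) = dx^3$ and its cyclic permutations, and compares with the ordinary cross product.

```python
def wedge_components(a, b):
    """Coefficients of omega_a ^ omega_b on dx^i ^ dx^k with i < k (0-based)."""
    return {(i, k): a[i] * b[k] - a[k] * b[i]
            for i in range(3) for k in range(i + 1, 3)}

def hodge_star_two_form(coeffs):
    """Star of a two-form on R^3, returned as one-form components.

    *(dx^2 ^ dx^3) = dx^1, *(dx^3 ^ dx^1) = dx^2, *(dx^1 ^ dx^2) = dx^3;
    since dx^1 ^ dx^3 = -dx^3 ^ dx^1, the (0, 2) coefficient enters with a minus sign.
    """
    return (coeffs[(1, 2)], -coeffs[(0, 2)], coeffs[(0, 1)])

def cross(a, b):
    """Ordinary vector product in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a, b = (1, 2, 3), (4, 5, 6)
star_of_wedge = hodge_star_two_form(wedge_components(a, b))
```

The sign in the middle component is the only place where the ordering $i < k$ of the base two-forms differs from the cyclic ordering used in (5.59).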

Finally, to a given vector field $a(x)$ we can also associate the following two-form:

$$\overset{2}{\omega}_a \stackrel{\text{def}}{=} a_1\, dx^2 \wedge dx^3 + (\text{cyclic permutations}) . \tag{5.60}$$

Taking its exterior derivative, we obtain a three-form whose coefficient is the
divergence of $a$,

$$d\overset{2}{\omega}_a = \left( \frac{\partial a_1}{\partial x^1} + \frac{\partial a_2}{\partial x^2} + \frac{\partial a_3}{\partial x^3} \right) dx^1 \wedge dx^2 \wedge dx^3 . \tag{5.61}$$

Of course, the star operation can be applied to the two expressions (5.60) and
(5.61), giving the results

$$*\overset{2}{\omega}_a = \overset{1}{\omega}_a \quad \text{and} \quad *(d\overset{2}{\omega}_a) = \operatorname{div} a .$$
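In the vector language of $\mathbb{R}^3$ the property $d \circ d = 0$, (5.55), becomes the two familiar identities curl grad $f = 0$ and div curl $a = 0$. The following sketch checks them numerically for sample fields. It is a plausibility check, not a proof: the test fields and helper names are assumptions of ours, and all derivatives are central differences, so the results vanish only up to rounding error.

```python
import math

def partial(func, i, x, h=1e-3):
    """Central-difference approximation of d func / d x^i at x (0-based i)."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (func(xp) - func(xm)) / (2 * h)

def grad(f):
    """Components of df on the base fields, cf. the gradient in the text."""
    return lambda x: [partial(f, i, x) for i in range(3)]

def curl(a):
    """Components b_i of * d(omega_a), cf. (5.57)."""
    def b(x):
        d = lambda i, k: partial(lambda y: a(y)[k], i, x)   # d a_k / d x^i
        return [d(1, 2) - d(2, 1), d(2, 0) - d(0, 2), d(0, 1) - d(1, 0)]
    return b

def div(a):
    """Coefficient of the three-form d(omega^2_a), cf. (5.61)."""
    return lambda x: sum(partial(lambda y, i=i: a(y)[i], i, x) for i in range(3))

def f(x):
    """Assumed smooth test function."""
    return math.sin(x[0]) * x[1] ** 2 + math.exp(x[2]) * x[0] * x[1]

def a(x):
    """Assumed smooth test vector field."""
    return [x[1] * x[2] ** 2, math.sin(x[0]) * x[2], x[0] ** 2 * x[1]]

point = (0.3, -0.7, 1.1)
curl_of_grad = curl(grad(f))(point)
div_of_curl = div(curl(a))(point)
rotation_curl = curl(lambda x: [-x[1], x[0], 0.0])(point)   # analytically (0, 0, 2)
```

That the discrete results are so small is no accident: nested central differences commute, which mirrors the equality of mixed partial derivatives used in the proof of (5.55).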



The dimension $n = 3$ is essential if one wishes to interpret the vector product
$a \times b$ as another vector. This isomorphism does not hold in dimensions other than
3. Note, however, that the cross product $a \times b$ in $\mathbb{R}^3$ is a vector of a different nature
than $a$ or $b$. For example, $a = r$ (position vector) and $b = p$ (momentum vector)
are odd with respect to space reflection, while their vector product $l = r \times p$
(angular momentum vector) is even. A vector that is even under space reflection
is said to be an axial vector.

A final remark: one may be surprised that the one-form (5.56) can be used to
describe a vector field, even though vector fields have the coordinate representation
$\sum_i a^i(x)\, \partial_i$. The reason for this is that $\mathbb{R}^3$ admits a metric that acts on vector
fields: $g(v, w)$ with $g(\partial_i, \partial_k) = g_{ik}$. Interpreting the metric, with one argument $w$
held fixed, as the one-form $g(\,\cdot\,, w)$ shows that it generates an isomorphism
between $\mathfrak{X}(M)$ and $\mathfrak{X}^*(M)$.



5.5 Hamilton-Jacobi and Lagrangian Mechanics 

In Sects. 5.1 and 5.3.3 we described qualitatively the manifolds of generalized co- 
ordinates as well as their tangent and cotangent bundles on which the Lagrangian 
function and the Hamiltonian function are respectively defined (cf. Fig. 5.2). In 
this section we examine these relations in more detailed and precise terms. We 
study geometric objects that live on the manifolds sketched in Fig. 5.2 and most 
of which are already known to us from Chap. 2. In particular, we define and study 
the so-called canonical two-form on phase space, which describes the symplectic 
structure of phase space (cf. Sect. 2.28), as well as all consequences following from 
this structure (such as Liouville’s theorem, Poisson brackets, etc.). We study the 
Hamiltonian vector fields, (i.e. the canonical equations in a geometric language), 
and the geometric formulation of Lagrangian mechanics, as well as the relation 
between these two descriptions. 

5.5.1 Coordinate Manifold Q, Velocity Space T Q, 
and Phase Space T* Q 

In Sect. 5.3.3 we remarked that Lagrangian functions $L(q, \dot q, t)$ are functions on
the tangent bundle $TQ$ of the coordinate manifold $Q$, i.e. $L \in \mathfrak{F}(TQ)$,

$$L : TQ \to \mathbb{R} . \tag{5.62}$$

In writing this down we have used a local coordinate expression. Indeed, $\{q\} =
\{q^1, \ldots, q^f\}$ represents the point $u \in Q$ in a chart, $f = \dim Q$ being the number
of degrees of freedom, while $\{\dot q\} = \{\dot q^1, \ldots, \dot q^f\}$ gives the local components
of an arbitrary tangent vector $v_u = \sum_i \dot q^i \partial_i \in T_uQ$. One should not be confused
by the notation: the $\{\dot q\}$ are the tangent vectors to all possible curves $\gamma(\tau)$ passing
through $u \in Q$. Only if we are given the solutions $q = \Phi(t, t_0, q_0)$ of the
equations of motion (which follow from the Lagrangian function) do their tangent
vectors generate the velocity field corresponding to real physical motion.

According to (5.62) $L$ is to be understood really as a function on the manifold
$TQ$. It is not a mapping of the kind studied in Sect. 5.3.5, which assigns to each
tangent vector, an element of $TQ$, a real number (in other words, it is not a
one-form). Let us analyze this in a little more detail. First, we confirm that $TQ$, the
tangent bundle of the smooth manifold $Q$, is again a smooth manifold of dimension
$\dim TQ = 2 \dim Q$. Therefore, it is possible to define smooth functions on
$TQ$. (The general prescription is this. Let $M$ be a smooth manifold of dimension
$m$. With $(\varphi, U)$, a local chart of $(M, \mathcal{A})$ belonging to the complete atlas $\mathcal{A}$, we
construct the corresponding differential, or tangent, mapping $T\varphi$, following the
definition (5.40a). With $U \subset M$, $T\varphi$ maps the domain $TU = U \times T_uM$, $u \in U$,
of $TM$ onto $\varphi(U) \times \mathbb{R}^m$. One then shows that $T\mathcal{A} = \{(T\varphi, TU)\}$ is a complete
atlas for the manifold $TM$.)

In the simplest case a Lagrangian function has the local form (the so-called
natural form)

$$L = T_{\rm kin}(q, \dot q) - V(q) , \tag{5.63}$$

where $V$ is a potential, while $T_{\rm kin}$ is the kinetic energy whose general form could
be

$$T_{\rm kin} = \frac{1}{2} \sum_{i,k=1}^{f} \dot q^i\, g_{ik}(q)\, \dot q^k . \tag{5.64}$$

Here, the tensor $g_{ik}(q)$ is the matrix representation of a metric and may depend on
the base point $q$. For a single particle in $\mathbb{R}^3$ we have $g_{ik} = \delta_{ik}$, with $i, k = 1, 2, 3$.
Of course, a potential that does not depend on velocities, say $V(u)$, is initially
defined to be a function on $Q$. However, from Sect. 5.4.1, it can easily be transported
to $TQ$. Indeed, if $\pi : TQ \to Q$ is the natural projection (5.25), then
the pull-back of the function $V(u)$

$$\pi^* V = V \circ \pi$$

is a function on $TQ$. The action of $\pi^* V$ on elements $v_u$ of $T_uQ$ is very simple:
$\pi$ projects onto the base point $u$, i.e. just cuts out the vector component of $v_u$.

The kinetic energy (5.64), in turn, is defined on $TQ$ from the start, in a nontrivial
way. To understand this better, we first give a precise definition of the metric.
So far we have dealt with the set of smooth vector fields $\mathfrak{X}(M)$ and with the set
of smooth one-forms $\mathfrak{X}^*(M)$, cf. Sects. 5.3.4-5.3.5. The former are also called
contravariant tensors of rank 1 and one may write equivalently

$$\mathfrak{X}(M) \equiv \mathfrak{T}^1_0(M) . \tag{5.65a}$$

The latter are also said to be covariant tensors of rank 1 and one writes correspondingly

$$\mathfrak{X}^*(M) \equiv \mathfrak{T}^0_1(M) . \tag{5.65b}$$

We have further considered geometric objects that can be understood to be
tensors of higher rank. For example, the two-forms we generated by taking the
exterior product of two one-forms, $\overset{2}{\omega} = \overset{1}{\omega}_a \wedge \overset{1}{\omega}_b$, are smooth, bilinear mappings
from the product $TM \times TM$ to $\mathbb{R}$. Therefore, they are covariant tensors of rank
2 that, in addition, are antisymmetric. In general, a tensor $T^r_s$ with $r$ contravariant
indices and $s$ covariant indices is defined to be a multilinear mapping of $r$ copies
of $T^*M$ times $s$ copies of $TM$ onto the real numbers, viz.

$$(T^r_s)_p : (T_p^*M)^r \times (T_pM)^s \to \mathbb{R} . \tag{5.66}$$

A tensor field of type $\binom{r}{s}$ assigns to each point $p \in M$ a tensor (5.66), in much the
same way as the vector fields (5.26) and the one-forms (5.35) did, both of which
are special cases of this general definition. The set of all smooth tensor fields of
type $\binom{r}{s}$ is denoted by $\mathfrak{T}^r_s(M)$.

Here we wish to define the metric, which is another special tensor field. Loosely 
speaking, a metric serves to define the norm of vectors and the scalar product of 
vectors (thereby specifying, in particular, orthogonality of vectors). Furthermore, 
by means of the metric tensor a vector (which is a contravariant rank-1 tensor)
is turned into a covariant object (i.e. a covariant tensor of rank 1). In either case,
the metric acts on vectors, i.e. on elements of the tangent space. Keeping this in 
mind, the following definition will be plausible. 



ME. Definition of metric. A metric on a smooth manifold $M$ is a tensor
field $g$ from $\mathfrak{T}^0_2(M)$ (the smooth covariant tensor fields of rank 2), whose
representative at every point $p$ of $M$ is symmetric and nondegenerate. This
means that

(i) $g_p(v_p, w_p) = g_p(w_p, v_p)$ for all $v_p, w_p \in T_pM$ and at each point
$p \in M$, and

(ii) if $g_p(v_p, w_p) = 0$ for a fixed $v_p \in T_pM$, but all $w_p \in T_pM$, then
$v_p = 0$, at every point $p \in M$.



We can treat the metric as a mapping. In analogy to (5.26) and (5.35) we have

$$g \in \mathfrak{T}^0_2(M) : M \to T^*M \times T^*M : p \mapsto g_p , \quad \text{where} \tag{5.67a}$$

$$g_p : T_pM \times T_pM \to \mathbb{R} : (v, w) \mapsto g_p(v, w) . \tag{5.67b}$$

Locally, i.e. in local charts, the metric can be applied to base fields, yielding the
so-called metric tensor

$$g_p(\partial_i, \partial_k) = g_{ik}(p) . \tag{5.68}$$

The requirements ME(i) and ME(ii) then imply that (i) $g_{ik}(p) = g_{ki}(p)$, and (ii)
the matrix $\{g_{ik}(p)\}$ is nonsingular. Its inverse is denoted by $g^{ik}$. Using the decomposition
(5.29) of vector fields in terms of base fields, we have

$$g_p(v, w) = \sum_{i,k=1}^{n} v^i\, g_{ik}(p)\, w^k , \tag{5.69}$$

where $v^i$ and $w^k$ are the components of $v_p$ and $w_p$, respectively, in a local
representation of $T_pM$. The same statement can be phrased differently: locally the
metric tensor can be written as a linear combination of tensor products of base
one-forms as follows⁴:

$$g = \sum_{i,k} g_{ik}(p)\, dx^i \otimes dx^k . \tag{5.70}$$

Equipped with this knowledge we readily understand the structure of the form
(5.64) of the kinetic energy, which is a function on $TQ$. Let $v_u \in T_uQ$ be represented
locally by $v_u = \sum_i \dot q^i \partial_i$. Then, obviously, $T_{\rm kin} = \frac{1}{2}\, g_u(v_u, v_u)$. In fact, we
may say much more than that. If $g_p$, (5.67a), is applied to only one vector field,
a mapping from $TM$ to $T^*M$ is obtained,

$$g_p : TM \to T^*M : w \mapsto g_p(\bullet, w) \quad \text{(dot denotes vacancy)} .$$

In other words, $g_p(\bullet, w)$ is a one-form, $g_p(\bullet, w) = \omega_w$, which, upon application
to a vector $v \in T_pM$, yields the real number $g_p(v, w)$. Thus, the metric
assigns to each vector field $X \in \mathfrak{X}(M)$ the smooth one-form $g(\bullet, X) \in \mathfrak{X}^*(M)$,
and vice versa. This is precisely what happens when one introduces (in charts) the
generalized momenta $p_i = \partial L/\partial \dot q^i$, which are canonically conjugate to the $q^i$.
Using (5.63) and (5.64) one obtains

$$p_i = \frac{\partial L}{\partial \dot q^i} = \sum_k g_{ik}(p)\, \dot q^k = g_p\Big(\bullet, \sum_k \dot q^k \partial_k\Big) . \tag{5.71}$$
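The content of (5.64) and (5.71), namely that the metric turns the velocity components $\dot q^k$ into the momentum components $p_i$, is easily made concrete. The sketch below is an illustration under assumptions of ours: a particle of mass `mass` moving in the plane, described in polar coordinates $q = (r, \varphi)$, so that $g_{ik} = \mathrm{diag}(m, m r^2)$; the function names are hypothetical.

```python
def polar_metric(q, mass=1.0):
    """Assumed example: g_ik for a particle in the plane, q = (r, phi)."""
    r, _ = q
    return [[mass, 0.0], [0.0, mass * r * r]]

def kinetic_energy(metric, q, qdot):
    """T_kin = (1/2) sum_ik qdot^i g_ik(q) qdot^k, cf. (5.64)."""
    g = metric(q)
    n = len(qdot)
    return 0.5 * sum(qdot[i] * g[i][k] * qdot[k]
                     for i in range(n) for k in range(n))

def lower_index(metric, q, qdot):
    """p_i = sum_k g_ik(q) qdot^k, cf. (5.71)."""
    g = metric(q)
    n = len(qdot)
    return [sum(g[i][k] * qdot[k] for k in range(n)) for i in range(n)]

def momenta_by_differentiation(metric, q, qdot, h=1e-6):
    """p_i = dT_kin / d qdot^i by central differences (V does not depend on qdot)."""
    result = []
    for i in range(len(qdot)):
        up, down = list(qdot), list(qdot)
        up[i] += h
        down[i] -= h
        result.append((kinetic_energy(metric, q, up)
                       - kinetic_energy(metric, q, down)) / (2 * h))
    return result

q, qdot = (2.0, 0.3), (0.5, -1.5)
p_metric = lower_index(polar_metric, q, qdot)           # (m rdot, m r^2 phidot)
p_numeric = momenta_by_differentiation(polar_metric, q, qdot)
energy = kinetic_energy(polar_metric, q, qdot)
```

In this example the second component, $p_\varphi = m r^2 \dot\varphi$, is the angular momentum, which shows that the components of the one-form $g(\bullet, v)$ generally differ from those of the vector $v$ as soon as $g_{ik} \neq \delta_{ik}$.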

The transition from the variables $\{q^i, \dot q^j\}$ to the variables $\{q^i, p_j\}$ that we studied
in Chap. 2 in reality means that one goes over from a description of mechanics
on the tangent bundle $TQ$ to a description on the cotangent bundle $T^*Q$. If there
exists a metric on $Q$ then there is the isomorphism sketched above, which allows
one to identify the two pictures. In general, however, this canonical identification is
not guaranteed. In any case, whether or not a metric exists, $TQ$ and $T^*Q$ are two
different spaces. Therefore, the transition from the Lagrangian formulation of mechanics
to the Hamiltonian formulation is more than a simple change of variables.
Very much like $Q$ and $TQ$, the cotangent bundle $T^*Q$ is a smooth manifold. In
mechanics $T^*Q$ is the phase space. In local charts it is described by coordinates
$\{q^i, p_k\}$, where $p_k$ has the character of a one-form, see (5.71). The Lagrangian
function is defined on $TQ$, the Hamiltonian function on $T^*Q$ (cf. Fig. 5.2). The
two representations of mechanics are related by the Legendre transformation $\mathcal{L}$,
as explained in Chap. 2.

⁴ Using well-known techniques of linear algebra one can show that at each point $p \in M$ one can
find a basis such that $g_{ik}$ is diagonal, i.e. $g = \sum_i \varepsilon_i\, dx^i \otimes dx^i$, with $\varepsilon_i = \pm 1$. If all $\varepsilon_i$ are equal
to $+1$, the metric is said to be Riemannian. In all other cases it is said to be semi-Riemannian.

The general case (without assuming a metric on Q ) is treated by Abraham 
and Marsden (1981): mechanics on T Q and its formulation on T*Q are related 
by means of the so-called fibre derivative. We cannot go into this more general 
treatment without introducing further mathematical tools. We point out, however, 
that the restricted case discussed above exhibits all essential features. 



5.5.2 The Canonical One-Form on Phase Space 



Fig. 5.11. The cotangent bundle $M \stackrel{\text{def}}{=} T^*Q$ is the phase space. $M$ being a smooth manifold
itself, it possesses a tangent bundle $TM = T(T^*Q)$. $\tau_Q$ and $\tau_Q^*$ are the canonical projections from
$TQ$ and $T^*Q$ to $Q$, respectively, while $\tau_M$ is the projection from $TM$ to $M$. $TM$ and $TQ$, in turn,
are related by the tangent mapping corresponding to $\tau_Q^*$

The Hamiltonian function is defined on the manifold $M = T^*Q$, which plays a
central role in mechanics. Figure 5.11 shows in more detail the manifolds $Q$, $TQ$,
$T^*Q$, and, in addition, the tangent bundle $TM$ of the phase space. We shall return
briefly to Lagrangian mechanics (on $TQ$) in Sect. 5.6 below. Here, our goal is to
work out more clearly the geometric-symplectic structure of mechanics in phase
space, well known to us from Chap. 2, and to understand it from a higher level.
One possible approach is provided by what is called the canonical one-form $\theta_0$ on
phase space,

$$\theta_0 : M \to T^*M : m \in M \mapsto (\theta_0)_m \in T_m^*M , \tag{5.72}$$

which is defined as follows. Let $\alpha$ be an arbitrary smooth one-form on the coordinate
manifold $Q$:

$$\alpha : Q \to T^*Q : u \in Q \mapsto \alpha_u \in T_u^*Q . \tag{5.73}$$

The form $\theta_0$ is to be defined on $M = T^*Q$. As $\alpha$ provides a mapping from $Q$ to $M$,
we can use it to pull back $\theta_0$ from $M$ to $Q$. This yields the one-form $(\alpha^*\theta_0)$, which
lives on the base manifold $Q$. With this remark in mind we state the following
definition.



C1F. The canonical one-form $\theta_0$ is the unique form on $M = T^*Q$ whose
pull-back onto $Q$ by means of an arbitrary one-form $\alpha$ (5.73) yields precisely
this $\alpha$. Expressed in a formula, the canonical one-form $\theta_0$ fulfills

$$(\alpha^*\theta_0) = \alpha \quad \text{for all } \alpha \in \mathfrak{X}^*(Q) . \tag{5.74}$$

This requirement fixes $\theta_0$ uniquely.



As shown in Fig. 5.11, a chart $(\varphi, U)$ of the domain $U \subset Q$ induces a chart
$(T\varphi, TU)$ for $TU \subset TQ$, as well as a chart $(T^*\varphi, T^*U)$ for $T^*U \subset M = T^*Q$.
A point $u \in U$ has the image $\{q^i\} = \{\varphi^i(u)\}$, which belongs to the neighborhood
$U' = \varphi(U)$ in $\mathbb{R}^f$. A tangent vector $v_u \in T_uQ$, with base point $u$, has the image
$\{q^i = \varphi^i(u), v^i = T\varphi^i(v) = \dot q^i\}$ in $U' \times \mathbb{R}^f$. Similarly, each one-form $\alpha_u \in T_u^*Q$
has the image $\{q^i, \alpha_i \equiv p_i\}$ in $U' \times (\mathbb{R}^f)^*$. Thus, the local representation of $\alpha_u$
(5.37) reads

$$\alpha_u = \sum_{j=1}^{f} \alpha_j(q)\, dq^j = \sum_{j=1}^{f} p_j\, dq^j . \tag{5.75}$$

When expressed in local form, the defining equation (5.74) is in fact very simple:
$(\theta_0)_m$, being a one-form belonging to $T_m^*M = T_m^*(T^*Q)$, must have the general
local form

$$(\theta_0)_m = \sum_i \sigma_i\, dq^i + \sum_k \tau^k\, dp_k ,$$

where $\sigma_i$ and $\tau^k$ are smooth functions of $(q, p)$. The condition (5.74) requires that
these functions be

$$\sigma_i(q, \alpha(q)) = \sigma_i(q, p) = p_i , \qquad \tau^k(q, \alpha(q)) = \tau^k(q, p) = 0 .$$

Thus, the local form of the canonical one-form is the same as $\alpha_u$ (5.75),

$$(\theta_0)_m = \sum_{i=1}^{f} p_i\, dq^i , \quad m \in M = T^*Q . \tag{5.76}$$

Note, however, that $\theta_0$ is defined on the phase space $M = T^*Q$, i.e. that $(\theta_0)_m$
is an element of $T_m^*M$, in contrast to the arbitrary one-form $\alpha$, which lives “one
storey below”.
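The local argument leading from (5.74) to (5.76) can be mimicked numerically. In a chart, the pull-back of a one-form $\sum_i \sigma_i\, dq^i + \sum_k \tau^k\, dp_k$ by the section $q \mapsto (q, \alpha(q))$ has the components $\sigma_j + \sum_k \tau^k\, \partial \alpha_k/\partial q^j$. The sketch below is illustrative: the sample section `alpha`, the contrasting form `other_form`, and all names are assumptions of ours, and $\partial\alpha_k/\partial q^j$ is approximated by central differences.

```python
import math

def pull_back_by_section(form, alpha, q, h=1e-6):
    """Components of alpha^* Theta at q.

    `form(q, p)` returns (sigma, tau), the coefficients of dq^i and dp_k;
    the section is q -> (q, alpha(q)).
    """
    sigma, tau = form(q, alpha(q))
    result = []
    for j in range(len(q)):
        qp, qm = list(q), list(q)
        qp[j] += h
        qm[j] -= h
        # d alpha_k / d q^j for all k, by central differences
        d_alpha = [(ap - am) / (2 * h) for ap, am in zip(alpha(qp), alpha(qm))]
        result.append(sigma[j] + sum(t * d for t, d in zip(tau, d_alpha)))
    return result

def canonical_one_form(q, p):
    """theta_0 = sum_i p_i dq^i: sigma_i = p_i, tau^k = 0, cf. (5.76)."""
    return (list(p), [0.0] * len(p))

def other_form(q, p):
    """A different one-form on T*Q (sigma_i = p_i, tau^k = q^k), for contrast."""
    return (list(p), list(q))

def alpha(q):
    """Assumed sample one-form on Q, given by its components alpha_j(q)."""
    return [q[0] * q[1], math.sin(q[0]) + q[1] ** 2]

q0 = [0.4, 1.3]
pulled_canonical = pull_back_by_section(canonical_one_form, alpha, q0)
pulled_other = pull_back_by_section(other_form, alpha, q0)
```

For `canonical_one_form` the pull-back reproduces `alpha(q0)`, as required by C1F, whereas a form with $\tau^k \neq 0$ picks up derivative terms of $\alpha$ and therefore fails the condition.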

Remark. The canonical one-form is the key to the geometric formulation of
mechanics on phase space. Starting from the definition given above and making
use of Fig. 5.11, one can work out the following pattern. Let $u$ be a fixed point
on the base manifold $Q$, $v_u$ a tangent vector from $T_uQ$, and $\alpha_u$ a one-form from
$T_u^*Q$. Then $r \stackrel{\text{def}}{=} \alpha_u(v_u)$ is a real number. Using the definition (5.74) we can write
it alternatively as $r = (\alpha^*\theta_0)_u(v_u)$. Now, as $\alpha$ maps the base manifold $Q$ onto $T^*Q$, the
corresponding tangent mapping $T\alpha$ maps $TQ$ onto $TM$, the base point $u$ being fixed.
Let $w_{\alpha_u} \in T_{\alpha_u}M$ be the preimage of $\alpha_u$ by the projection $\tau_M$, i.e. $w_{\alpha_u} = \tau_M^{-1}(\alpha_u)$.
Then we have $w_{\alpha_u} = T\alpha(v_u)$, while the same real number $r$ is also given by

$$r = (\theta_0)_{m=\alpha_u}(w_{\alpha_u}) = \alpha_u \circ T\tau_Q^*(w_{\alpha_u}) .$$

This last equation can be used to define $\theta_0$ (Abraham, Marsden 1981, Sect. 3.2.10).
With this alternative but equivalent definition the derivation of the local form (5.76)
is a bit more tedious.

One can understand that $\theta_0$ is indeed unique by noting that condition C1F is
to be fulfilled for all $\alpha_u$. These forms span the space $T_u^*Q$ completely. As the $v_u$
are arbitrary, too, their preimages $w_{\alpha_u}$ span the complete space $T_{\alpha_u}M$.

Loosely speaking, C1F is a prescription that says that arbitrary one-forms on
$Q$ should be interpreted as a specific one-form on $T^*Q$. It is canonical and characteristic
for the cotangent bundle insofar as one-forms live on $T^*Q$ and are pulled
back by mappings (in contrast to vector fields, which are mapped “forward”). The
local representation (5.76) is sufficient because one can always join together the
charts of a complete atlas and describe $\theta_0$ in this way, on the whole of $M = T^*Q$.
Of course, the definition given in (5.74), or the one described briefly in the remark
above, are completely free of coordinates.

Let $F = \psi \circ \varphi^{-1}$ be the transition mapping from the chart $(\varphi, U)$ to the chart
$(\psi, V)$. In the overlap of the images of $U$ and of $V$, $F$ maps the point $\{q\} = \varphi(u)$
to the point $\{Q\} = \psi(u)$. This is the same point in $\mathbb{R}^f$, but it is expressed in terms
of different coordinates. A tangent vector $v_u \in T_uQ$ whose coordinate image is
$\{\dot q\}$ in the first case and $\{\dot Q\}$ in the second is transformed by means of the
tangent mapping $TF$, while one-forms are pulled back according to (5.41). As to the
canonical one-form, we note that it keeps its local form (5.76). Indeed, the pull-back
(5.41) gives $p_i = \sum_k P_k\,\partial Q^k/\partial q^i$, and therefore

$$\sum_i p_i\,\mathrm{d}q^i = \sum_i \sum_k P_k \frac{\partial Q^k}{\partial q^i}\,\mathrm{d}q^i = \sum_k P_k\,\mathrm{d}Q^k . \tag{5.77}$$

This result is obvious because the definition (5.74) fixes $\theta_0$ on the whole of $T^*Q$
and because the local form (5.76) holds in each chart. The following assertion is
somewhat less obvious.



Proposition. Let $F : Q \to Q$ be a diffeomorphism on the base manifold
$Q$. With $\alpha$ a one-form on $Q$ the pull-back of $\alpha_u \in T_u^*Q$ is then defined in
either direction, so that $F$ induces a diffeomorphism $T^*F : T^*Q \to T^*Q$.
Then the pull-back of the canonical one-form is given by

$$(T^*F)^*\theta_0 = \theta_0 . \tag{5.78}$$

In this sense it is invariant.



Abraham and Marsden (1981, Theorem 3.2.12) provide a proof that does not 
make use of coordinates. In coordinates, the proof, in essence, follows from the 
calculation done in (5.77). 



5.5.3 The Canonical, Symplectic Two-Form on M 



The canonical two-form is defined to be (minus) the exterior derivative of the
canonical one-form $\theta_0$ of C1F (5.74), viz.

C2F. $$\omega_0 := -\mathrm{d}\theta_0 . \tag{5.79}$$

This is a closed form, $\mathrm{d}\omega_0 = -\mathrm{d}\circ\mathrm{d}\theta_0 = 0$. Its representation in local coordinates
follows from the local form (5.76) of $\theta_0$. It reads

$$(\omega_0)_m = \sum_{i=1}^{f} \mathrm{d}q^i \wedge \mathrm{d}p_i , \qquad m \in M . \tag{5.80}$$

This form is of special importance because it exhibits the symplectic structure of
phase space. This will be clear from the following observations and propositions.

As a two-form on $M$, $\omega_0$ is a bilinear mapping from $TM \times TM$ to the real
numbers. It acts on pairs $(w^{(a)}, w^{(b)})$ of vector fields on $M$, i.e. $(\omega_0)_m$ is applied
to pairs $(w_m^{(a)}, w_m^{(b)})$ of tangent vectors from $T_mM$, where $w_m^{(a)}$ and $w_m^{(b)}$ are the
representatives in $T_mM$ of $w^{(a)}$ and $w^{(b)}$, respectively. In charts any such vector
field has the form

$$w = \sum_{i=1}^{f} w^i \frac{\partial}{\partial q^i} + \sum_{k=1}^{f} w_k \frac{\partial}{\partial p_k} , \tag{5.81}$$







so that

$$(\omega_0)_m\bigl(w_m^{(a)}, w_m^{(b)}\bigr) = \sum_{i=1}^{f} \bigl(w^{(a)i} w_i^{(b)} - w_i^{(a)} w^{(b)i}\bigr) . \tag{5.82}$$

If we agree on ordering coordinates such that $\eta^i = \mathrm{d}q^i$, with $i = 1, \ldots, f$, form
the first set of base forms, and $\eta^{i+f} = \mathrm{d}p_i$, $i = 1, \ldots, f$, form the second set,
and if we write $(\omega_0)_m$ in the general form

$$(\omega_0)_m = \sum_{i,k} \omega_{ik}\,\eta^i \wedge \eta^k , \tag{5.82'}$$

it is easy to see that its coefficients $\omega_{ik}$ are given by

$$\{\omega_{ik}\} = \begin{pmatrix} 0_{f\times f} & \mathbb{1}_{f\times f} \\ -\mathbb{1}_{f\times f} & 0_{f\times f} \end{pmatrix} .$$

This matrix is nothing but the matrix $J$ of (2.102). As $J$ is regular, one sees that
$(\omega_0)_m$ is nondegenerate and skew-symmetric. As this holds at each point $m \in M$,
the canonical two-form $\omega_0$ is nondegenerate and skew-symmetric on the whole of
$M$. Thus, the form $\omega_0$ must be closely related to the canonical equations (2.99).
Before we turn to this relationship we wish to point out an interesting property of
the cotangent bundle $M = T^*Q$.
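
The properties of the coefficient matrix just listed can be checked mechanically. The following is a minimal sketch in plain Python, not part of the original text; the helper names `canonical_J`, `matmul`, `transpose` and `omega0` are hypothetical, and tangent vectors are assumed to be stored as lists with the $q$-components first and the $p$-components second.

```python
def canonical_J(f):
    """Coefficient matrix of the canonical two-form for f degrees of freedom."""
    n = 2 * f
    J = [[0] * n for _ in range(n)]
    for i in range(f):
        J[i][i + f] = 1    # upper right block: +1
        J[i + f][i] = -1   # lower left block: -1
    return J


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def omega0(J, v, w):
    """Value of the two-form on two tangent vectors, cf. (5.82)."""
    n = len(v)
    return sum(v[i] * J[i][k] * w[k] for i in range(n) for k in range(n))


J = canonical_J(2)
minus_identity = [[-1 if i == j else 0 for j in range(4)] for i in range(4)]
print(matmul(J, J) == minus_identity)                 # J*J = -1, so J is regular
print(transpose(J) == [[-x for x in row] for row in J])  # skew-symmetry
```

Since $J\cdot J = -\mathbb{1}$, the inverse is $J^{-1} = -J$, which is the fact used later when the Hamiltonian vector field is defined.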

Taking the $k$-fold exterior products of $(\omega_0)_m$ with itself yields forms of degree
$2k$. For example,

$$(\omega_0)_m \wedge (\omega_0)_m = \sum_{i_1, i_2 = 1}^{f} \mathrm{d}q^{i_1} \wedge \mathrm{d}p_{i_1} \wedge \mathrm{d}q^{i_2} \wedge \mathrm{d}p_{i_2}
= -2! \sum_{i_1 < i_2} \mathrm{d}q^{i_1} \wedge \mathrm{d}q^{i_2} \wedge \mathrm{d}p_{i_1} \wedge \mathrm{d}p_{i_2} ,$$

$$(\omega_0)_m \wedge (\omega_0)_m \wedge (\omega_0)_m = -3! \sum_{i_1 < i_2 < i_3} \mathrm{d}q^{i_1} \wedge \mathrm{d}q^{i_2} \wedge \mathrm{d}q^{i_3} \wedge \mathrm{d}p_{i_1} \wedge \mathrm{d}p_{i_2} \wedge \mathrm{d}p_{i_3} .$$

The form of highest degree that can be constructed in this way has degree $2f$. It
reads

$$\underbrace{(\omega_0)_m \wedge \ldots \wedge (\omega_0)_m}_{f\text{-fold}} = f!\,(-1)^{[f/2]}\, \mathrm{d}q^1 \wedge \mathrm{d}q^2 \wedge \ldots \wedge \mathrm{d}q^f \wedge \mathrm{d}p_1 \wedge \ldots \wedge \mathrm{d}p_f , \tag{5.83a}$$

where $[f/2]$ is the largest integer smaller than or equal to $f/2$. This $f$-fold product
generates the oriented volume form

$$\Omega := \frac{(-1)^{[f/2]}}{f!}\, \omega_0 \wedge \ldots \wedge \omega_0 \quad (f \text{ factors}) \tag{5.83b}$$

on $T^*Q$, whose value in the point $m$ is proportional to the expression (5.83a). This
is an important result. On the cotangent bundle of a smooth manifold there always
exist the canonical forms $\theta_0$ and $\omega_0$ and thus also the volume form (5.83a). The
cotangent bundle of a manifold $Q$ is always orientable, even if its base manifold $Q$
is not. At the same time, we have established the basis for Liouville’s theorem. Only
the result (5.83a) enables us to talk about flows on phase space that preserve
volume and orientation. As a consequence, the specific properties of phase space that
we studied by means of the canonical equation (2.99), in the more “pedestrian”
approach of Chap. 2, rest on an underlying, deeper geometric structure. The following
subsection is devoted to a short discussion of this structure. (As this is a digression,
the reader may wish to skip it on a first reading and move on directly to Sect. 5.5.5.)
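
The sign and the factorial in (5.83a) can be checked by brute force for small $f$. The sketch below is an illustration only, with hypothetical helper names. It labels $\mathrm{d}q^i$ by $i$ and $\mathrm{d}p_i$ by $f + i$, expands the $f$-fold product term by term, and sums the signs needed to bring each term into the order $\mathrm{d}q^1 \wedge \ldots \wedge \mathrm{d}q^f \wedge \mathrm{d}p_1 \wedge \ldots \wedge \mathrm{d}p_f$.

```python
from itertools import permutations
from math import factorial


def perm_sign(seq):
    """Sign of the permutation that sorts seq, computed by counting inversions."""
    inversions = sum(1 for a in range(len(seq))
                     for b in range(a + 1, len(seq)) if seq[a] > seq[b])
    return -1 if inversions % 2 else 1


def top_coefficient(f):
    """Coefficient of dq^1^...^dq^f^dp_1^...^dp_f in the f-fold product of omega_0."""
    total = 0
    # Only terms with pairwise distinct indices survive in the wedge product.
    for order in permutations(range(f)):
        labels = []
        for i in order:
            labels += [i, f + i]   # the factor dq^i ^ dp_i
        total += perm_sign(labels)
    return total


for f in range(1, 5):
    print(f, top_coefficient(f), factorial(f) * (-1) ** (f // 2))
```

For each $f$ the two printed numbers agree, which is the content of (5.83a) in these low-dimensional cases.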

5.5.4 Symplectic Two-Form and Darboux’s Theorem 

Very much like the metric on a Riemannian or semi-Riemannian manifold the
canonical two-form is a covariant tensor of rank 2 on the manifold $M$. Like the
metric it is nondegenerate. While the metric pertains to the set of symmetric
tensors, $\omega_0$ belongs to the set of antisymmetric forms of degree two.

Let $M$ be a smooth manifold of dimension $\dim M = n$, and let $\omega$ be a covariant
tensor (a general one, at first),

$$\omega \in \mathcal{T}_2^0(M) : M \to T^*M \times T^*M : p \mapsto (\omega)_p . \tag{5.84}$$

$\omega$ is said to be nondegenerate if $(\omega)_p$ has this property at every point $p \in M$.
$T_pM$ is a vector space of dimension $n$, $T_pM \times T_pM$ has dimension $2n$, and $(\omega)_p$
maps $T_pM \times T_pM$ onto the real numbers.

One proves the following assertions.

(a) If $(\omega)_p$ is symmetric and nondegenerate, i.e. if the matrix $\omega_{ik} = (\omega)_p(\partial_i, \partial_k)$
is regular, then there is an ordered basis of $T_pM$ and an ordered basis of $T_p^*M$,
dual to the former, such that this matrix is diagonal, its eigenvalues being $\varepsilon_i = \pm 1$
(cf. Footnote 4 to (5.70)).

(b) If $(\omega)_p$ is antisymmetric and if the matrix $\{\omega_{ik}\}$ has rank $r$, then $r$ is an
even integer and there is an ordered basis of $T_pM$ and its dual in $T_p^*M$ such that

$$(\omega)_p = \sum_{i=1}^{r/2} \mathrm{d}x^i \wedge \mathrm{d}x^{i+r/2} ,$$

i.e. such that the matrix $\{\omega_{ik}\}$ has the form

$$\{\omega_{ik}\} = \begin{pmatrix} 0 & \mathbb{1} & 0 \\ -\mathbb{1} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

with $\mathbb{1}$ being the unit matrix of dimension $r/2$. In this case $\omega$ can be nondegenerate
only if the dimension $n$ of $M$ is even; the rank is then $r = n$.
This latter assertion is followed up by the following proposition.

Proposition. Let $\omega$ be an antisymmetric two-form on the manifold $M$
(following the pattern of (5.84)). The form $\omega$ is nondegenerate if and only if $M$
has even dimension, $n = 2k$, and if the $k$-fold exterior product $\omega \wedge \ldots \wedge \omega$
is a volume form on $M$.



Expressed differently, this says that if there exists a nondegenerate,
skew-symmetric two-form on $M$, then $M$ is orientable. If this is true, (5.83b) provides
an oriented volume form, viz.

$$\Omega_\omega = \frac{(-1)^{[k/2]}}{k!}\, \omega \wedge \ldots \wedge \omega \quad (k\text{-fold}) \tag{5.85}$$

with $k$ given by $\dim M = n = 2k$.

The relation to the symplectic group that we studied in Sect. 2.28 becomes
clear by way of the following definitions and assertions.

SYF. Every nondegenerate, skew-symmetric two-form $\sigma$ on a vector space $V$ of
even dimension $n = 2k$ is called a symplectic form. In the case treated
above, we had $\sigma = (\omega)_p$ and $V = T_pM$.

SYV. The pair $(V, \sigma)$ is said to be a symplectic vector space if $\dim V = 2k$ and
if $\sigma$ has the property SYF.

SYT. Symplectic transformations are defined to be transformations between vector
spaces that preserve the symplectic structure SYV, i.e. if $(V, \sigma)$ and $(W, \tau)$
are symplectic vector spaces, then

$$F : V \to W$$

is symplectic precisely if the pull-back of $\tau$ onto $V$ equals $\sigma$, $F^*\tau = \sigma$.

The vector spaces $V$ and $W$ need not have the same dimension. However, if
they do have the same dimension $n = 2k$, $F$ preserves the oriented volume. This
is seen by showing that $F^*\Omega_\tau = \Omega_\sigma$, where $\Omega_\tau$ and $\Omega_\sigma$ are the standard $n$-forms
(5.85) on $W$ and $V$, respectively. Symplectic transformations have the following
property. The symplectic mappings of a symplectic vector space $(V, \sigma)$ onto itself,

$$F : (V, \sigma) \to (V, \sigma) , \qquad F^*\sigma = \sigma ,$$

form the symplectic group $\mathrm{Sp}_{2f}(\mathbb{R})$. In order to show this, let us choose that basis
$\{e_i\}$ of $V$ for which $\sigma$ has the canonical form

$$(\sigma_{ik}) = \bigl(\sigma(e_i, e_k)\bigr) = J .$$

In this basis the transformation $F$ is represented by the matrix $\{F^k{}_i\}$, i.e.
$e_i' = \sum_{k=1}^{n} F^k{}_i\, e_k$. The condition $F^*\sigma = \sigma$ says that $\sigma(e_i', e_j')$ must be equal to
$\sigma(e_i, e_j)$, i.e. that $F^{\mathrm{T}} J F = J$. This is precisely (2.113) and tells us that the matrix
$F$ pertains to $\mathrm{Sp}_{2f}(\mathbb{R})$.
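
The matrix condition $F^{\mathrm{T}} J F = J$ lends itself to a direct numerical test. The following is a small sketch in plain Python, offered as an illustration under stated assumptions: the function name `is_symplectic` and the tolerance `tol` are hypothetical choices, and $J$ is taken in the block form $\bigl(\begin{smallmatrix} 0 & \mathbb{1} \\ -\mathbb{1} & 0 \end{smallmatrix}\bigr)$ used above.

```python
def canonical_J(f):
    """Block matrix J = [[0, 1], [-1, 0]] with f x f blocks."""
    n = 2 * f
    J = [[0.0] * n for _ in range(n)]
    for i in range(f):
        J[i][i + f] = 1.0
        J[i + f][i] = -1.0
    return J


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def is_symplectic(F, tol=1e-12):
    """True if the square matrix F satisfies F^T J F = J up to tol."""
    n = len(F)
    if n % 2 or any(len(row) != n for row in F):
        return False
    J = canonical_J(n // 2)
    lhs = matmul(transpose(F), matmul(J, F))
    return all(abs(lhs[i][j] - J[i][j]) <= tol
               for i in range(n) for j in range(n))


shear = [[1.0, 0.5], [0.0, 1.0]]        # determinant 1
stretch = [[2.0, 0.0], [0.0, 1.0]]      # determinant 2
print(is_symplectic(shear), is_symplectic(stretch))
```

For $f = 1$ the condition reduces to $\det F = 1$, so the shear passes and the pure stretch fails. In higher dimensions the condition is stronger than volume preservation.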

Note that the definitions and assertions given above apply to the representative
$(\omega)_p$ of $\omega$ over the base point $p \in M$. They are extended to $\omega$, and thus to the
whole of $M$, by means of the following theorem.



Darboux’s Theorem. Let $\omega$ be a nondegenerate two-form on the manifold
$M$ whose dimension is therefore even, $\dim M = n = 2k$. The form $\omega$ is
closed, i.e. $\mathrm{d}\omega = 0$, precisely if for each point $p \in M$ there exists a chart
$(\varphi, U)$ such that $\varphi(p) = 0$ and such that in every point $p' \in U \subset M$ with

$$\varphi(p') = \bigl(x^1(p'), \ldots, x^k(p'), \ldots, x^{2k}(p')\bigr)$$

$\omega$ admits the local representation

$$\omega = \sum_{i=1}^{k} \mathrm{d}x^i \wedge \mathrm{d}x^{i+k} \tag{5.86}$$

on the neighborhood $U$.



For the proof of this theorem, as well as of the other assertions of this section, 
we refer to Abraham and Marsden (1981). 

We close this digression with some definitions and remarks that serve the pur- 
pose of generalizing definitions SYF, SYV, and SYT to arbitrary manifolds. 



S1. A symplectic form on a manifold $M$ of even dimension $\dim M = n = 2k$
is a nondegenerate, skew-symmetric, closed two-form $\omega$,

$$\mathrm{d}\omega = 0 . \tag{5.87}$$

S2. A pair $(M, \omega)$, with $\omega$ having property S1, is said to be a symplectic
manifold.

S3. Those charts where (5.86) holds true (whose existence is guaranteed by
Darboux’s theorem) are said to be symplectic charts. Their local coordinates
are called canonical coordinates.

S4. A smooth mapping $F$ that relates two symplectic manifolds $(M, \sigma)$ and
$(N, \tau)$ is said to be symplectic if $F^*\tau = \sigma$. The symplectic mappings are
the canonical transformations of mechanics if the starting and the target
manifolds are identical.



These notions belong to what is called symplectic geometry. As far as mechan- 
ics is concerned, the importance of symplectic geometry should be clear from our 
discussion. In fact, it seems to be relevant for many more parts of physics and there- 
fore leads directly into modern research. In this connection we refer the reader to 
Guillemin and Sternberg (1986). 









5.5.5 The Canonical Equations 

In Chap. 2, Sect. 2.25, we showed that the canonical equations (2.45) could be
written in the form (2.99), viz.

$$\dot{x} = J\,H_{,x} = (X_H)_x . \tag{5.88}$$

Here, $x$ is a point in phase space, while $H_{,x}$ and $J$ are defined as in (2.102). We
realize that (5.88) is a local representation in charts. As indicated by the subscript
$x$, the Hamiltonian vector field on the right-hand side of (5.88) (cf. the definition
in Sect. 5.3.1) is a coordinate expression in charts. On the basis of the results
obtained in Sect. 5.5.3 it is clear that the canonical two-form will serve the purpose
of formulating the canonical equations of motion in a coordinate-free manner, i.e.
directly on $T^*Q$, the cotangent bundle of the coordinate manifold $Q$.

Let $M = T^*Q$, as before. Vector fields on $M$ assign to each $p \in M$ an element
of the tangent space $T_pM$ at that point,

$$X \in \mathfrak{X}(M) : M \to TM : p \mapsto X_p .$$

In charts $X_p$ has the local form (5.81). Equation (5.88) defines the Hamiltonian
vector fields in charts, i.e. componentwise. Thus, in the notation of (5.88),

$$(X_H)^i = \frac{\partial H}{\partial p_i} , \qquad (X_H)_k = -\frac{\partial H}{\partial q^k} . \tag{5.89}$$

These partial derivatives of $H$ also appear in the exterior derivative $\mathrm{d}H$. As $H$
is a function on $M$, its exterior derivative is equal to the total differential. When
expressed locally, we have

$$\mathrm{d}H = \sum_{i=1}^{f} \frac{\partial H}{\partial q^i}\,\mathrm{d}q^i + \sum_{j=1}^{f} \frac{\partial H}{\partial p_j}\,\mathrm{d}p_j . \tag{5.90}$$

As we know, the Hamiltonian vector field is

$$\begin{pmatrix} (X_H)^i \\ (X_H)_j \end{pmatrix} = \begin{pmatrix} 0 & \mathbb{1} \\ -\mathbb{1} & 0 \end{pmatrix} \begin{pmatrix} \partial H/\partial q^i \\ \partial H/\partial p_j \end{pmatrix} ,$$

or $(X_H)_x = J\,(\mathrm{d}H)_x$, where the subscript $x$ is meant to indicate that we still
compare coordinate expressions.

As $J^{-1} = -J$, we can also write $-J\,(X_H)_x = (\mathrm{d}H)_x$. From this we can
abstract the coordinate-free definition of the Hamiltonian vector field as follows. $J$
is nothing but the local matrix representation (5.82') of the canonical two-form
$\omega_0$. Such a two-form $\omega$ acts on pairs of vector fields. In analogy to the case of
the metric, one may instead take $\omega$ to act on only one vector field, e.g. $\omega(X, \bullet)$
with the dot denoting a vacancy (it stands for the missing second argument). As
such it maps the tangent bundle $TM$ onto $\mathbb{R}$, i.e. it operates like an exterior form
of degree 1. With this remark in mind, the following definition becomes readily
understandable.

HVF. Let $(M, \omega)$ be a symplectic manifold, i.e. $\dim M = 2f$ is even and
$\omega$ has properties S1. The Hamiltonian function $H$ is assumed to be given
as a smooth function on $M = T^*Q$. The Hamiltonian vector field $X_H$ is
defined through the condition

$$\omega(X_H, \bullet) = \mathrm{d}H . \tag{5.91}$$

The triple $(M, \omega, X_H)$ is said to be a Hamiltonian system.



With $Y \in \mathfrak{X}(M)$ an arbitrary vector field on $M$, we have from (5.91)

$$\omega(X_H, Y) = \mathrm{d}H(Y) .$$

As $\omega$ is nondegenerate, this equation fixes $X_H$ uniquely. Indeed, if there were two
different vector fields $X_H$ and $X_H'$ for the same function $H$, then $\omega(X_H - X_H', Y) = 0$
for all $Y$. This is possible only if $X_H - X_H'$ vanishes identically. On the other
hand, $\mathrm{d}H(Y)$ cannot be zero for all $Y$ unless $\mathrm{d}H \equiv 0$, i.e. unless $H$ is constant.
Hence, for each $H$ there is a unique $X_H$. In local coordinates the defining equation
(5.91) yields precisely the expressions (5.89). This is verified by direct calculation,

$$\omega_p(X_H, \bullet) = \sum_{i=1}^{f} (X_H)^i\,\mathrm{d}p_i - \sum_{k=1}^{f} (X_H)_k\,\mathrm{d}q^k .$$

Comparing with $\mathrm{d}H$ (5.90) yields (5.89). The definition (5.91) is independent of
coordinates, however, and it is not restricted to the case of finite dimension.
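
In a chart, (5.89) gives a direct recipe for the components of $X_H$. As a rough numerical sketch (not from the original text), the function below approximates them by central differences. The name `hamiltonian_vector_field`, the step size `h`, and the example Hamiltonian of a harmonic oscillator with mass `m` and spring constant `k` are assumptions made for illustration.

```python
def hamiltonian_vector_field(H, q, p, h=1e-6):
    """Approximate ((X_H)^i, (X_H)_k) = (dH/dp_i, -dH/dq^k), cf. (5.89).

    H takes two lists (q, p) and returns a float; derivatives are
    central differences with step h.
    """
    f = len(q)
    x_q, x_p = [], []
    for i in range(f):
        p_plus, p_minus = list(p), list(p)
        p_plus[i] += h
        p_minus[i] -= h
        x_q.append((H(q, p_plus) - H(q, p_minus)) / (2 * h))

        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        x_p.append(-(H(q_plus, p) - H(q_minus, p)) / (2 * h))
    return x_q, x_p


def oscillator(q, p, m=2.0, k=3.0):
    # Assumed example: H = p^2/(2m) + k q^2/2 with one degree of freedom.
    return p[0] ** 2 / (2 * m) + 0.5 * k * q[0] ** 2


x_q, x_p = hamiltonian_vector_field(oscillator, [0.5], [4.0])
print(x_q, x_p)   # close to [p/m] and [-k*q]
```

The two components reproduce the right-hand sides of the canonical equations, $\dot q = p/m$ and $\dot p = -kq$, up to discretization and rounding error.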

The integral curves of the vector field $X_H$, i.e. the solutions of the differential
equation

$$\dot{\gamma}(t) = (X_H)_{\gamma(t)} , \tag{5.92}$$

describe the possible physical motions of the system defined by the Hamiltonian
function $H$. When expressed in local coordinates, (5.92) becomes (5.88) and hence
the local form of the canonical equations of motion (2.45).

If $H$ has no explicit time dependence and if $\gamma(t)$ is a solution of (5.92), then

$$\frac{\mathrm{d}}{\mathrm{d}t} H(\gamma(t)) = \mathrm{d}H_{\gamma}(\dot\gamma) = \mathrm{d}H\bigl(X_H(\gamma(t))\bigr) = \omega\bigl(X_H(\gamma), X_H(\gamma)\bigr) = 0 .$$

This is the well-known fact that $H$ is constant along solutions of the equations of
motion.
It is not difficult to formulate once more Liouville’s theorem, Sect. 2.29, using
the tools and results developed so far. When phrased in geometric terms it reads
as follows.

Liouville’s Theorem. Let $(M, \omega, X_H)$ be a Hamiltonian system, i.e. let the
nondegenerate, closed two-form $\omega$ and the Hamiltonian vector field $X_H$ be
given on a manifold with even dimension. Denote by $\Phi_t$ the flow of the
vector field $X_H$ (this is the set of all integral curves corresponding to all
possible initial conditions). For all $t$ the flow $\Phi_t$ is symplectic, i.e. $\Phi_t^*\omega = \omega$.
As a consequence, the oriented volume $\Omega_\omega$ (5.85) is conserved.



In Sect. 2.29 we proved this theorem in two equivalent ways. The proof in terms
of geometry is instructive in several respects. The reader who wishes to skip it,
on a first reading, should move on immediately to Sect. 5.5.6. The proof makes
use of the Lie derivative and of the fact that the symplectic form $\omega$ is closed. The
Lie derivative $L_X$, which refers to a smooth vector field $X$, is obtained from the
following geometric picture. The vector field $X$ defines (at least locally on $M$) the
flow $\Phi_\tau$, i.e. the set of all solutions of the differential equation (5.42). Consider
an arbitrary differentiable geometric object $T$ on $M$ such as a function, another
vector field, a $k$-form or an $\binom{r}{s}$-tensor field. We ask the question in which way
the object $T$ changes differentially, along the lines of the flow $\Phi_\tau$ of the vector
field $X$. For a function the answer is very simple. At the point $p \in M$ this is just
the directional derivative

$$\mathrm{d}f_p(X_p) = (L_X f)_p ,$$

described in (5.33). The same derivative can also be written as

$$\left.\frac{\mathrm{d}}{\mathrm{d}\tau} f(\Phi_\tau(p))\right|_{\tau=0} = \left.\frac{\mathrm{d}}{\mathrm{d}\tau}\, \Phi_\tau^* f(p)\right|_{\tau=0} ,$$

where $\Phi_{\tau=0}(p) = p$ and where the right-hand side is to be understood as in (5.39).
If $T$ is another vector field $T = Y$, its Lie derivative is given by the commutator
$[X, Y] := L_X Y$, as explained in Sect. 5.3.4. (One may define $L_X$ to be a
differential operator on the smooth tensor fields on $M$, with the condition that it operate
on functions and on vector fields as described above, see Abraham and Marsden
(1981). The following definition is equivalent to this.)

By the existence and uniqueness theorem for differential equations of the type
(5.42) the flow $\Phi_\tau$ of $X$ is a (local) diffeomorphism of $M$. Therefore, the geometric
object $T$ can be transported forward or backward along that flow (cf. Sect. 5.4.1).
In particular, it can be differentiated along the flux lines of $\Phi_\tau$ (see footnote 5).



5 For this reason V.I. Arnol’d (1978) calls the Lie derivative the fisherman’s derivative. The fisherman
sees only the river in front of him. He sees all kinds of objects floating by on the river and
takes their differential along the lines of the river’s flow.

Consider now the special case $T = \alpha$ being an exterior $k$-form on $M$, $X$ a
vector field, and $\Phi_\tau$ its (local) flow. According to what we said above, the Lie
derivative fulfills the identity

$$\frac{\mathrm{d}}{\mathrm{d}\tau}\, \Phi_\tau^*\alpha = \Phi_\tau^* L_X\alpha . \tag{5.93}$$

The Lie derivative $L_X$, at the point $q = \Phi_\tau(p)$, pulled back to the point $p$, is
the derivative with respect to the orbit parameter $\tau$ of the pull-back of the form $\alpha$.
Like $\alpha$, $L_X\alpha$ is a $k$-form. Functions are to be read as zero-forms for which
$L_X f = \mathrm{d}f(X)$. One can show that the Lie derivative can be expressed by means of the
exterior derivative. If the vector field $X$ is inserted in the position of the first argument
of the form $\alpha$, then $\alpha(X, \bullet(k-1)\bullet)$ is a $(k-1)$-form (positions 2 to $k$ are vacant).
Taking the exterior derivative of the latter yields again a $k$-form, $\mathrm{d}(\alpha(X, \bullet(k-1)\bullet))$.
If, in turn, we differentiate $\alpha$ first we obtain the $(k+1)$-form $\mathrm{d}\alpha$. Inserting $X$ into
this $(k+1)$-form leads again to a $k$-form, namely $(\mathrm{d}\alpha)(X, \bullet(k)\bullet)$ (see footnote 6).
We then have

$$L_X\alpha = (\mathrm{d}\alpha)(X, \bullet(k)\bullet) + \mathrm{d}\bigl(\alpha(X, \bullet(k-1)\bullet)\bigr) . \tag{5.94}$$

(The proof goes by induction, see e.g. Abraham and Marsden (1981).) With the
identities (5.93) and (5.94) Liouville’s theorem follows immediately. Inserting the
symplectic form $\omega$ as well as the Hamiltonian vector field $X_H$, we obtain

$$\frac{\mathrm{d}}{\mathrm{d}t}\, \Phi_t^*\omega = \Phi_t^* L_{X_H}\omega = \Phi_t^*\bigl[(\mathrm{d}\omega)(X_H, \bullet, \bullet) + \mathrm{d}(\omega(X_H, \bullet))\bigr] .$$

The first term vanishes because $\omega$ is closed. The second vanishes, too, because
$\mathrm{d}(\omega(X_H, \bullet)) = \mathrm{d}\circ\mathrm{d}H = 0$, by the definition (5.91). Finally, as $\Phi_{t=0}$ is the
identity, we obtain $\Phi_t^*\omega = \omega$ for all $t$ for which the flow is defined. This proves the
theorem.
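
For one degree of freedom the statement $\Phi_t^*\omega = \omega$ means that the Jacobian matrix $M$ of the time-$t$ map satisfies $M^{\mathrm{T}} J M = J$, which for $2 \times 2$ matrices is the same as $\det M = 1$. The sketch below checks this numerically. It is an illustration only: the pendulum Hamiltonian $H = p^2/2 - \cos q$, the Runge-Kutta integrator and all function names are assumptions, and the integrator only approximates the exact flow.

```python
import math


def pendulum_field(q, p):
    # Components (dH/dp, -dH/dq) for the assumed H = p^2/2 - cos q, cf. (5.89).
    return p, -math.sin(q)


def energy(q, p):
    return 0.5 * p * p - math.cos(q)


def rk4_flow(q, p, t, steps=2000):
    """Approximate time-t flow map by the classical fourth-order Runge-Kutta scheme."""
    dt = t / steps
    for _ in range(steps):
        k1q, k1p = pendulum_field(q, p)
        k2q, k2p = pendulum_field(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = pendulum_field(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = pendulum_field(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q, p


def flow_jacobian(q, p, t, h=1e-5):
    """Jacobian of the flow map with respect to (q, p), by central differences."""
    qa, pa = rk4_flow(q + h, p, t)
    qb, pb = rk4_flow(q - h, p, t)
    qc, pc = rk4_flow(q, p + h, t)
    qd, pd = rk4_flow(q, p - h, t)
    return [[(qa - qb) / (2 * h), (qc - qd) / (2 * h)],
            [(pa - pb) / (2 * h), (pc - pd) / (2 * h)]]


def determinant(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


M = flow_jacobian(1.0, 0.5, 2.0)
print(determinant(M))   # close to 1: the phase-space area element is preserved
```

The same run can be used to confirm that `energy` changes only at the level of the integration error, in line with the conservation of $H$ along solutions.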



5.5.6 The Poisson Bracket 

An essential ingredient in the proof of Liouville’s theorem is the fact that the
symplectic two-form $\omega$ is closed. In this section we establish (once more) the
relationship between this form and the Poisson bracket, with the aim of understanding
better the significance of $\mathrm{d}\omega = 0$. (In Sect. 2.32 we showed that the Poisson bracket
of two dynamical quantities is identical to the symplectic, skew-symmetric scalar
product of their derivatives, hence the comment “once more”.)

6 This prescription is called the inner product: $(i_X\alpha)(Y_1, \ldots, Y_{k-1}) := \alpha(X, Y_1, \ldots, Y_{k-1})$ is said to be
the inner product of $X$ with $\alpha$. The identity (5.94) then reads $L_X\alpha = i_X(\mathrm{d}\alpha) + \mathrm{d}(i_X\alpha)$.

The dynamical quantities $f$ and $g$ that are to be inserted in the Poisson bracket
(2.122) are smooth functions on the phase space $M = T^*Q$. $M$ is a symplectic
manifold. Following the example of the Hamiltonian function (which is a smooth
function on $M$, too), we can assign to $f$ and $g$ vector fields $X_f$ and $X_g$,
respectively, by means of the definition (5.91). As $\omega$ is nondegenerate, the vector fields
are uniquely fixed by the equations

$$\omega(X_f, \bullet) = \mathrm{d}f \quad\text{and}\quad \omega(X_g, \bullet) = \mathrm{d}g . \tag{5.95}$$

The Poisson bracket of $f$ and $g$ is nothing but the expression

$$\{f, g\} = \omega(X_g, X_f) . \tag{5.96}$$

To see this, let us interpret (5.96) as a definition and let us verify that locally (i.e.
in charts) it is the same as (2.122). From (5.95) we have the local representation
of $X_f$,

$$X_f = \left( \frac{\partial f}{\partial p} ,\; -\frac{\partial f}{\partial q} \right) ,$$

and an analogous one for $X_g$. Inserting these into $\omega$, we find, according to (5.82),
that

$$\omega(X_g, X_f) = \frac{\partial g}{\partial p}\left(-\frac{\partial f}{\partial q}\right) + \frac{\partial g}{\partial q}\,\frac{\partial f}{\partial p} = \{f, g\} ,$$

i.e. precisely the expression (2.122). While the latter form is formulated in charts,
the definition (5.96) is free of coordinates on $M$.
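
The local expression can be turned into a few lines of code. The sketch below is illustrative and uses the sign convention of the formula just derived, $\{f, g\} = \sum_i \bigl(\partial f/\partial p_i\,\partial g/\partial q^i - \partial f/\partial q^i\,\partial g/\partial p_i\bigr)$. Phase-space points are assumed to be lists $(q^1, \ldots, q^f, p_1, \ldots, p_f)$, derivatives are approximated by central differences, and the names `partial` and `poisson_bracket` are hypothetical.

```python
def partial(F, x, j, h=1e-4):
    """Central-difference approximation of dF/dx_j at the point x."""
    x_plus, x_minus = list(x), list(x)
    x_plus[j] += h
    x_minus[j] -= h
    return (F(x_plus) - F(x_minus)) / (2 * h)


def poisson_bracket(F, G, f):
    """Return the function x -> {F, G}(x) for f degrees of freedom."""
    def bracket(x):
        return sum(partial(F, x, f + i) * partial(G, x, i)
                   - partial(F, x, i) * partial(G, x, f + i)
                   for i in range(f))
    return bracket


# One degree of freedom: x = (q, p).
q = lambda x: x[0]
p = lambda x: x[1]
H = lambda x: 0.5 * (x[1] ** 2 + x[0] ** 2)   # assumed example Hamiltonian

point = [0.3, 0.7]
print(poisson_bracket(p, q, 1)(point))   # close to 1
print(poisson_bracket(H, q, 1)(point))   # close to p = 0.7
print(poisson_bracket(H, p, 1)(point))   # close to -q = -0.3
```

With this convention $\{H, q\}$ and $\{H, p\}$ reproduce the right-hand sides of the canonical equations for the chosen $H$.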

The properties of Poisson brackets, well known to us from Chap. 2, can also
be formulated and proved in a manner that is independent of coordinates. One has
the following.

(i) The Poisson bracket can be expressed in terms of Lie derivatives, viz.

$$\{f, g\} = L_{X_f}\, g = \mathrm{d}g(X_f) = -L_{X_g} f = -\mathrm{d}f(X_g) . \tag{5.97}$$

(The reader should verify this in local form.)

Comparing this with the definition (5.93) of the Lie derivative yields assertions
(ii) and (iii).

(ii) The quantity $f$ is constant along the flow of $X_g$ if and only if $\{f, g\} = 0$.
The same statement holds with $f$ and $g$ interchanged. For example, let $\Psi_\tau$ be the
flow of $X_g$. Then, from (5.93),

$$\frac{\mathrm{d}}{\mathrm{d}\tau}(\Psi_\tau^* f) = \frac{\mathrm{d}}{\mathrm{d}\tau}(f \circ \Psi_\tau) = \Psi_\tau^* L_{X_g} f = -\Psi_\tau^*\{f, g\} .$$

This is zero if and only if the Poisson bracket vanishes.

(iii) Let $\Phi_t$ be the flow of the Hamiltonian vector field $X_H$, $g$ being a dynamical
quantity as above. In the same manner as in (ii) one shows that

$$\frac{\mathrm{d}}{\mathrm{d}t}(g \circ \Phi_t) = \{H, g \circ \Phi_t\} . \tag{5.98}$$

If $g$ does not depend explicitly on time, this is identical with (2.128). As we
know, the canonical equations themselves can be written in the form of (5.98), cf.
(2.127). What we have gained compared to Chap. 2 is this: the definition (5.96),
the expressions (5.97), and the equations of motion (5.98) are formulated in a way
independent of coordinates (without charts). Furthermore, they are not restricted
to finite dimensions.

There are many more properties of Poisson brackets that can be derived using
the geometric formulation. As we studied them in some detail in Chap. 2, though
using a local representation, we restrict the discussion to a few characteristic
examples.

The smooth functions $\mathfrak{F}(M)$ on the phase space (which form a real vector
space), together with the Poisson brackets, generate a Lie algebra. In order to see
this, we must verify that $\{f, g\}$ is bilinear, that $\{f, f\}$ vanishes, and that the Jacobi
identity holds true, viz.

$$\{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0 . \tag{5.99}$$

In local form, this identity was obtained by direct calculation, cf. Sect. 2.32 (2.131).
In a coordinate-free framework one proceeds as follows. Define a Poisson bracket
for one-forms $\mathrm{d}f$, $\mathrm{d}g$ (instead of functions, as above), by

$$\{\mathrm{d}f, \mathrm{d}g\} := \omega([X_f, X_g], \bullet) . \tag{5.100}$$

This Poisson bracket is again a one-form and we have $\mathrm{d}\{f, g\} = \{\mathrm{d}f, \mathrm{d}g\}$. The last
equation establishes the relation to the Poisson bracket of functions. (Abraham and
Marsden (1981) provide a proof.) With this result, and on the basis of the
definition (5.100) as well as (5.95), we conclude that the vector field $X_{\{f,g\}}$ defined by
$\omega(X_{\{f,g\}}, \bullet) = \mathrm{d}\{f, g\}$ equals the commutator of $X_f$ and $X_g$, $X_{\{f,g\}} = [X_f, X_g]$.

In a second step we write out the individual terms of (5.99), making use of
(5.97):

$$\{f, \{g, h\}\} = L_{X_f}(L_{X_g} h) ,$$

$$\{g, \{h, f\}\} = -L_{X_g}(L_{X_f} h) ,$$

$$\{h, \{f, g\}\} = -L_{X_{\{f,g\}}} h = -[L_{X_f}, L_{X_g}]\,h .$$

In the last expression we made use of the property $L_{[V,W]} = [L_V, L_W]$ of the Lie
derivative. Adding the three terms indeed yields the identity (5.99). We have only
sketched this proof here, because we had something else in mind: for the
definition (5.96) of the Poisson bracket, together with the definition (5.95) of the vector
fields corresponding to the functions $f$ and $g$, it was essential that the canonical
two-form was closed. Finally, then, this is the reason the algebra of the smooth
functions, with the composition $\{\,,\}$, is a Lie algebra.

The following proposition is of interest in the light of the discussion in
Sect. 2.32.

Proposition. Let $(\varphi, U)$ be a chart taken from the atlas for the symplectic manifold
$(M, \omega)$, chosen such that points $u \in U$ are represented by $q^1, \ldots, q^f, p_1, \ldots, p_f$.
This chart is symplectic (i.e. the canonical two-form becomes $\omega = \sum_{i=1}^{f} \mathrm{d}q^i \wedge \mathrm{d}p_i$)
if and only if the following Poisson brackets are fulfilled:

$$\{q^i, q^j\} = 0 = \{p_i, p_j\} , \qquad \{p_j, q^i\} = \delta_j^i . \tag{5.101}$$

Proof. (a) If the chart is symplectic one verifies (5.101) by direct calculation.
(b) We assume these equations to hold and determine the matrix representation
$\Omega = (\omega_{ik})$ of $\omega$ in the domain of this chart $(\varphi, U)$. $\Omega$ is regular and hence has
an inverse $(\sigma^{ik}) = \Sigma$. From (5.97) and (5.96) we have

$$\{q^i, q^k\} = \mathrm{d}q^i(X_{q^k}) = (X_{q^k})^i = \sigma^{ik} , \qquad i, k = 1, \ldots, f .$$

In a similar fashion one shows that $\{p_i, p_k\} = \sigma^{i+f,\,k+f}$ and
$\{q^i, p_k\} = \sigma^{i,\,k+f} = -\sigma^{k+f,\,i}$. By assumption

$$\Sigma = \begin{pmatrix} 0 & -\mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix} = -J = J^{-1} ,$$

where $J$ is defined as in (2.102). We conclude that $\Omega = J$ and hence that the chart
is symplectic. □

Finally, the invariance of Poisson brackets under canonical transformations
(2.124) is rediscovered in the following form. Let $F$ be a diffeomorphism
connecting two symplectic manifolds, $F : (M, \omega) \to (N, \varrho)$. This mapping is symplectic
precisely if it preserves the Poisson brackets of functions and/or one-forms, i.e.

$$\{F^*f, F^*g\} = F^*\{f, g\} \quad\text{for all } f, g \in \mathfrak{F}(N) .$$

In this case $F^*$ preserves the Lie algebra structure on the vector space of the smooth
functions.



5.5.7 Time-Dependent Hamiltonian Systems 

The preceding Sects. 5.5.1 to 5.5.6 gave an introduction to the mathematical
foundations of the theory of Hamilton and Jacobi. They should be sufficient to study
the theory of time-dependent systems as well, without any major difficulties. We
restrict our discussion to a few remarks and refer to the more specialized,
mathematical literature on mechanics for more details.

If the Hamiltonian function depends explicitly on time, $H : M \times \mathbb{R}_t \to \mathbb{R}$,
then also the corresponding Hamiltonian vector field depends on time, i.e. assigns
to each point $(m, t)$ of the direct product of phase space and time axis a
tangent vector in $T_mM \times \mathbb{R}$. The manifold $M \times \mathbb{R}_t$ cannot be symplectic because
its dimension is odd. However, the canonical two-form $\omega$ has maximal rank on
$M \times \mathbb{R}_t$, namely $2f$, where $f = \dim Q$. In a local chart representation of
$(m, t) \in U \times \mathbb{R}_t$, $U \subset M$, viz. $(q^1, \ldots, q^f, p_1, \ldots, p_f, t)$, the canonical two-form reads

$$\omega|_U = \sum_{i=1}^{f} \mathrm{d}q^i \wedge \mathrm{d}p_i$$

according to Darboux’s theorem, provided the chart is a symplectic one. As $\omega$ was
given by $\omega = -\mathrm{d}\theta$, we have locally

$$\mathrm{d}\Bigl(\theta - \sum_i p_i\,\mathrm{d}q^i\Bigr) = 0 .$$

The form in parentheses is closed. Hence, locally, according to Poincaré’s lemma,
it can be written as the exterior derivative of a function, i.e.

$$\theta = \sum_i p_i\,\mathrm{d}q^i + \mathrm{d}t .$$

Note that the exterior product $\theta \wedge \mathrm{d}\theta \wedge \ldots \wedge \mathrm{d}\theta$, with $f$ factors $\mathrm{d}\theta$, is a volume
form on $M \times \mathbb{R}_t$.

It is not difficult to generalize the time-independent situation discussed in the
previous sections to the case of time-dependent Hamiltonian vector fields. For
every fixed $t \in \mathbb{R}$, such a vector field

$$X : M \times \mathbb{R} \to TM$$

is a vector field on $M$. One associates with it a vector field $\tilde{X}$ on $M \times \mathbb{R}$,

$$\tilde{X} : M \times \mathbb{R} \to T(M \times \mathbb{R}) \cong TM \times T\mathbb{R} ,$$

($\cong$ means isomorphic) through the assignment

$$(m, t) \mapsto \bigl(X(m, t), (t, 1)\bigr) .$$

Regarding the integral curves of $\tilde{X}$, we can say the following. Let $\gamma : I \to M$ be
an integral curve of $X$ going through the point $m$. Then $\tilde\gamma : I \to M \times \mathbb{R}$ is the
integral curve of $\tilde{X}$ passing through the point $(m, 0)$ precisely if $\tilde\gamma(t) = (\gamma(t), t)$.
This is easily verified. Write

$$\tilde\gamma(t) = \bigl(\gamma(t), \tau(t)\bigr) .$$

This is an integral curve of $\tilde{X}$ provided

$$\tilde\gamma'(t) = \bigl(\gamma'(t), \tau'(t)\bigr) = \tilde{X}(\tilde\gamma(t)) ,$$

i.e. provided

$$\gamma'(t) = X(\gamma(t), t) \quad\text{and}\quad \tau'(t) = 1 .$$

However, as $\tau(0)$ should be equal to 0, we conclude that $\tau(t) = t$. The flux of $\tilde{X}$
is expressed in terms of the flux of $X$, viz.

$$\tilde\Phi_t(m, s) = \bigl(\Phi_{t+s,s}(m),\, t + s\bigr) .$$

Let $M$ be the phase space and let $H$ be a time-dependent Hamiltonian function
on $M \times \mathbb{R}$. Then $H(m, t)$, for fixed $t$, is a function on $M$,

$$H_t(m) := H(m, t) : M \to \mathbb{R} ,$$

whose vector field $X_{H_t}$ is determined as before. Define the vector field

$$X_H : M \times \mathbb{R} \to TM : (m, t) \mapsto X_{H_t}(m)$$

as well as the corresponding vector field $\tilde{X}_H$, to be constructed as above. The
corresponding integral curves of $\tilde{X}_H$ move across $M \times \mathbb{R}$, those of $X_H$ move across
$M$. The latter are identical with the phase portraits introduced in Chap. 1.

The canonical equations of motion hold in every symplectic chart. So $\gamma : I \to U$,
with $I \subset \mathbb{R}$ and $U \subset M$, is an integral curve of $X_H$ if and only if the equations

$$\frac{\mathrm{d}}{\mathrm{d}t}\bigl[q^i(\gamma(t))\bigr] = \frac{\partial H(\gamma(t), t)}{\partial p_i} , \qquad
\frac{\mathrm{d}}{\mathrm{d}t}\bigl[p_i(\gamma(t))\bigr] = -\frac{\partial H(\gamma(t), t)}{\partial q^i} , \qquad i = 1, \ldots, f ,$$

are fulfilled.



5.6 Lagrangian Mechanics and Lagrange Equations 

On the one hand, the Lagrangian function is defined as a smooth function on the
tangent bundle $TQ$ of the coordinate manifold $Q$, $L : TQ \to \mathbb{R}$. As we know from
Chap. 2, on the other hand, it appears in the expressions for the Legendre
transformation from Lagrangian mechanics, formulated on $TQ$, to Hamilton-Jacobi
mechanics, which lives on $T^*Q$, and vice versa. The geometric approach shows very
clearly that this is more than just a simple transformation of variables. The
formulation of Hamilton and Jacobi is characteristic for the cotangent bundle $T^*Q$. The
aim of this section is to show that Lagrangian mechanics is rather different from
this, also as far as its geometric interpretation is concerned. The main difference
is that on the tangent bundle one can define differential equations of second order
(i.e. the Euler-Lagrange equations well known to us) in a natural way, while this
is not possible on $T^*Q$.



5.6.1 The Relation Between the Two Formulations of Mechanics 



When expressed in local coordinates, the first step of the Legendre transformation
is the assignment

$$\Phi_L : \{q^i, \dot q^j\} \mapsto \Bigl\{q^i,\; p_j := \frac{\partial L}{\partial \dot q^j}\Bigr\} . \tag{5.102}$$

Going back from the charts to the original manifolds $TQ$ and $T^*Q$, (5.102) says
that we assign to an element of $T_uQ$, for fixed base point $u \in Q$, an element of
$T_u^*Q$ by means of derivatives of the Lagrangian function. In other words, the fibre
$T_uQ$ over $u \in Q$ of the tangent bundle $TQ$ is mapped to the fibre $T_u^*Q$ of the
cotangent bundle over the same base point. This mapping is linear and makes use
of the partial derivatives of the Lagrangian function within the fibre $T_uQ$ (in charts:
$q$ is fixed, the derivatives are taken with respect to $\dot q$). Thus, let $v_u$ be an element
of $T_uQ$, the fibre of $TQ$ in $u$. Denoting the restriction of the Lagrangian function
to this fibre by $L_u$, the mapping $\Phi_L$ (5.102) corresponds to the assignment

$$\Phi_L : T_uQ \to T_u^*Q : v_u \mapsto DL_u(v_u) , \tag{5.103}$$

where $D$ denotes the derivatives of $L$. The precise definition of $D$ on manifolds
would lead us too far from our main subject. Therefore, the following, somewhat
qualitative remarks that clarify matters in charts may be sufficient. Let $(\varphi, U)$ be
a chart taken from the atlas for $Q$ and $(T\varphi, TU)$ the induced chart for $TQ$. $L^{(\varphi)}$
denotes the restriction of the Lagrangian function to the domains of these charts.
Then $L^{(\varphi)} \circ T\varphi^{-1}$ is a function on $\mathbb{R}^f \times \mathbb{R}^f$, as shown schematically in Fig. 5.12.




Fig. 5.12. The Lagrangian function is defined on the tangent bundle $TQ$ (velocity space). Its representation in charts, $L^{(\varphi)} \circ T\varphi^{-1}$, is the local form that one knows from Chap. 2



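Denoting the derivatives with respect to the first and the second arguments by $D_1$ and $D_2$, respectively, we have

$$D_1 L^{(\varphi)} \circ T\varphi^{-1} = \Big\{\frac{\partial L}{\partial q^i}\Big\}, \tag{5.104a}$$

$$D_2 L^{(\varphi)} \circ T\varphi^{-1} = \Big\{\frac{\partial L}{\partial \dot q^i}\Big\}. \tag{5.104b}$$

The derivative $DL_u$ of (5.103) leaves the base point $u$ unchanged. Hence it is of the type (5.104b).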

$\Phi_L$ being a mapping from $TQ$ to $T^*Q$ that is induced by the Lagrangian function, the canonical forms C1F (5.74) and C2F (5.79) can be pulled back from $T^*Q$ to $TQ$. If $\Phi_L$ is a regular mapping$^7$, it is symplectic, so that canonical mechanics on $T^*Q$ can be pulled back to $TQ$. If, furthermore, $\Phi_L$ is a diffeomorphism, then the two formulations of mechanics are completely equivalent. As we know from Chap. 2, this is true if and only if (in charts) the matrix of the second derivatives of $L$ with respect to $\dot q$ is nowhere singular, i.e. if

$$\det\Big(\frac{\partial^2 L}{\partial \dot q^k\,\partial \dot q^i}\Big) \neq 0 \tag{5.105}$$

holds on the domain of definition of the problem. Strictly speaking, one should distinguish the cases where $\Phi_L$ is regular from those where, in addition, it is a diffeomorphism. In the first case, the condition (5.105) holds only locally while in the second it holds on the domain of all charts. For what follows we assume that $L$ is chosen such that $\Phi_L$ is a diffeomorphism.

7 A mapping $\Phi : M \to N$ is said to be regular in the point $p \in M$ if the corresponding differential, or tangent, mapping from $T_pM$ to $T_{\Phi(p)}N$ is surjective.
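As a minimal illustration (an assumed example, not taken from the text), consider one degree of freedom with the standard Lagrangian $L = \tfrac12 m \dot q^2 - U(q)$, where the mass $m > 0$ and the potential $U$ are hypothetical inputs. The sketch below simply evaluates (5.102) and (5.105) for this case.

```latex
\Phi_L : (q, \dot q) \longmapsto \Big(q,\; p = \frac{\partial L}{\partial \dot q} = m \dot q\Big),
\qquad
\det\Big(\frac{\partial^2 L}{\partial \dot q^2}\Big) = m \neq 0 .
```

Under these assumptions $\Phi_L$ is invertible on every fibre, with $\Phi_L^{-1}(q, p) = (q, p/m)$, so in this chart it is a diffeomorphism and not merely regular.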

5.6.2 The Lagrangian Two-Form 

The canonical two-form $\omega_0$, defined by (5.79), can be pulled back to $TQ$ by means of $\Phi_L$. This yields what is called the Lagrangian two-form

$$\omega_L := \Phi_L^*\,\omega_0 . \tag{5.106}$$

The pull-back of $\omega_0$, the canonical two-form on $T^*Q$, to $\omega_L$ on $TQ$ is defined as described in Sect. 5.4.1 (5.41). Very much like $\omega_0$ the form $\omega_L$ is closed,

$$d\omega_L = 0 .$$

This follows because the exterior derivative of the pull-back of a form, $d(F^*\omega)$, is equal to the pull-back $F^*(d\omega)$ of the exterior derivative of the original form (see also Exercise 5.11).

Furthermore, the operation of pull-back commutes with the restriction to open neighborhoods on the manifold, on which a given form is defined. For $F : M \to N$ and $\omega$ an exterior $k$-form on $N$, one has

$$(F^*\omega)\big|_{U \subset M} = F^*\big(\omega\big|_{F(U) \subset N}\big) .$$

Therefore, the expression of $\omega_L$ in charts can be computed from the local representation (5.80) of $\omega_0$. Let $U$ be the domain of a chart on $Q$ and $TU$ the corresponding domain on $TQ$. Then we have in the domain of the chart $(\varphi, U)$

$$\omega_L\big|_{TU} = (\Phi_L^*\omega_0)\big|_{TU} = \Phi_L^*\big(\omega_0\big|_{T^*U}\big) = \Phi_L^*\Big(\sum_i dq^i \wedge dp_i\Big) = \sum_i d(\Phi_L^* q^i) \wedge d(\Phi_L^* p_i) .$$

Here we have used the equality $F^*(\sigma \wedge \tau) = (F^*\sigma) \wedge (F^*\tau)$ for two exterior forms $\sigma$ and $\tau$, as well as the fact that the exterior derivative commutes with $F^*$. The last expression for $\omega_L$ contains the functions $q^i$ and $p_k$, pulled back to $TQ$, for which we have

$$\Phi_L^* q^i = q^i , \qquad \Phi_L^* p_k = \frac{\partial L}{\partial \dot q^k} .$$

Thus we find

$$\omega_L\big|_{TU} = \sum_i dq^i \wedge d\Big(\frac{\partial L}{\partial \dot q^i}\Big) .$$

The exterior derivative of the function $\partial L/\partial \dot q^k$ is easily calculated with the rules of Sect. 5.4.4. Thus, we obtain

$$\omega_L\big|_{TU} = \sum_{i,k}\Big(\frac{\partial^2 L}{\partial \dot q^i\,\partial q^k}\, dq^i \wedge dq^k + \frac{\partial^2 L}{\partial \dot q^i\,\partial \dot q^k}\, dq^i \wedge d\dot q^k\Big) . \tag{5.107}$$



The same result is obtained from the pull-back to $TQ$ of the canonical one-form (5.74), $\theta_L := \Phi_L^*\theta_0$. In charts it reads

$$\theta_L\big|_{TU} = \sum_i \frac{\partial L}{\partial \dot q^i}\, dq^i .$$

Taking the negative exterior derivative, $\omega_L = -d\theta_L$, yields again the expression (5.107).

Thus, if the mapping $\Phi_L$ is regular, or even a diffeomorphism, then $\Phi_L$ is symplectic: it maps the symplectic manifold $(TQ, \omega_L)$ onto the symplectic manifold $(T^*Q, \omega_0)$.
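To see both kinds of terms in (5.107) at work, here is a short worked sketch for an assumed two-dimensional Lagrangian with a velocity-dependent coupling. The Lagrangian, the mass $m$ and the constant $b$ are hypothetical choices made for illustration only; they do not appear in the text.

```latex
L = \tfrac12 m\,(\dot x^2 + \dot y^2) + \tfrac12 b\,(x \dot y - y \dot x),
\qquad
\theta_L = \big(m \dot x - \tfrac12 b\, y\big)\, dx + \big(m \dot y + \tfrac12 b\, x\big)\, dy ,
\\[4pt]
\omega_L = -d\theta_L
         = m\,\big(dx \wedge d\dot x + dy \wedge d\dot y\big) - b\, dx \wedge dy .
```

The same result follows term by term from (5.107): the mixed derivatives $\partial^2 L/\partial \dot x\,\partial y = -b/2$ and $\partial^2 L/\partial \dot y\,\partial x = +b/2$ both contribute $-\tfrac12 b\, dx \wedge dy$, so the $dq^i \wedge dq^k$ part of $\omega_L$ does not vanish in general.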



5.6.3 Energy Function on T Q and Lagrangian Vector Field 

In discussing the Legendre transformation in Chap. 2, we considered the function

$$E(q, \dot q, t) = \sum_i \dot q^i\, \frac{\partial L}{\partial \dot q^i} - L(q, \dot q, t) , \tag{5.108}$$

which led to the Hamiltonian function, after transformation to the variables $q$ and $p$ (taking account of the condition (5.105)). For autonomous systems this was the expression for the energy, the energy then being a constant of the motion. Given the Hamiltonian function and the canonical two-form $\omega_0$, the Hamiltonian vector field was constructed following the definition HVF (5.91). A similar construction can be performed on $TQ$. For that purpose we first define the function $E$ on the manifold $TQ$, its chart representation being given by (5.108) above. With $u \in Q$, $v_u \in T_uQ$, the first term on the right-hand side of (5.108) is given a coordinate-free meaning by the definition

$$W : TQ \to \mathbb{R} : v_u \longmapsto \Phi_L(v_u) \cdot v_u . \tag{5.109a}$$

According to (5.103), $\Phi_L(v_u)$ is a linear mapping from $T_uQ$ to $\mathbb{R}$, i.e. it is an element of $T^*_uQ$, which acts on $v_u \in T_uQ$. One verifies easily that, in charts, $W$ is indeed given by the first term on the right-hand side of (5.108). $W$ is said to be the action.

The energy function, understood to be a smooth function on $TQ$, is then defined by

$$E := W - L . \tag{5.109b}$$

We follow the analogous construction on phase space, Sect. 5.5.5. We take the exterior derivative of $E$ and define the Lagrangian vector field by means of the Lagrangian two-form $\omega_L$, as follows.

LVF. Given the function $E = W - L$ on $TQ$, as well as the two-form $\omega_L = \Phi_L^*\omega_0$, with $\Phi_L$ being a regular mapping (or even a diffeomorphism), the Lagrangian vector field $X_E$ is defined uniquely by

$$\omega_L(X_E, \cdot) = dE . \tag{5.110}$$



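In local form $E$ is given by (5.108) and therefore

$$dE\big|_{TU} = \sum_i \frac{\partial L}{\partial \dot q^i}\, d\dot q^i + \sum_{i,k} \dot q^i \Big(\frac{\partial^2 L}{\partial q^k\,\partial \dot q^i}\, dq^k + \frac{\partial^2 L}{\partial \dot q^k\,\partial \dot q^i}\, d\dot q^k\Big) - \sum_k \Big(\frac{\partial L}{\partial q^k}\, dq^k + \frac{\partial L}{\partial \dot q^k}\, d\dot q^k\Big)$$

$$\phantom{dE\big|_{TU}} = \sum_{i,k} \Big(\dot q^i\, \frac{\partial^2 L}{\partial q^k\,\partial \dot q^i} - \delta_{ik}\, \frac{\partial L}{\partial q^i}\Big)\, dq^k + \sum_{i,k} \dot q^i\, \frac{\partial^2 L}{\partial \dot q^k\,\partial \dot q^i}\, d\dot q^k .$$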



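It is instructive to write out explicitly the local form of (5.110) as well as the vector field $X_E$. For the sake of simplicity, we do this for the case of one degree of freedom, $f = 1$. The general case is no more difficult and will be dealt with in the next section. Let $\partial$ and $\bar\partial$ denote the base fields $\partial/\partial q$ and $\partial/\partial \dot q$, respectively. Then, in coordinates, the Lagrangian vector field is $X_E = v\,\partial + \bar v\,\bar\partial$, while another, arbitrary vector field reads $Y = w\,\partial + \bar w\,\bar\partial$. From (5.107) we have

$$\omega_L(X_E, Y) = \frac{\partial^2 L}{\partial \dot q^2}\,\big(v \bar w - \bar v w\big) ,$$

while the action of $dE$ on $Y$ gives the local result

$$dE(Y) = \Big(\dot q\, \frac{\partial^2 L}{\partial q\,\partial \dot q} - \frac{\partial L}{\partial q}\Big)\, w + \dot q\, \frac{\partial^2 L}{\partial \dot q^2}\, \bar w .$$

Inserting these expressions into the equation $\omega_L(X_E, Y) = dE(Y)$ and comparing the coefficients of $w$ and $\bar w$, we obtain

$$v = \dot q , \qquad \bar v = \Big(\frac{\partial L}{\partial q} - \dot q\, \frac{\partial^2 L}{\partial \dot q\,\partial q}\Big) \Big/ \frac{\partial^2 L}{\partial \dot q^2} .$$

It is seen that the condition (5.105) is essential.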
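The one-dimensional result for $\bar v$ can be checked numerically. The following is a minimal sketch, assuming a plane pendulum as a hypothetical test system (the Lagrangian, the constants `M`, `ELL`, `G` and the function names are illustrative and not taken from the text). It evaluates the three derivatives of $L$ by central finite differences and compares $\bar v$ with the familiar pendulum acceleration.

```python
import math

# Hypothetical test system (not from the text): plane pendulum with
# L(q, qd) = (1/2) m l^2 qd^2 + m g l cos(q).
M, ELL, G = 1.3, 0.7, 9.81


def lagrangian(q, qd):
    return 0.5 * M * ELL**2 * qd**2 + M * G * ELL * math.cos(q)


def vbar(L, q, qd, h=1e-4):
    """Second component of X_E for f = 1:
    vbar = (dL/dq - qd * d2L/(dqd dq)) / (d2L/dqd^2),
    with all derivatives approximated by central differences."""
    dL_dq = (L(q + h, qd) - L(q - h, qd)) / (2 * h)
    d2L_dqd_dq = (L(q + h, qd + h) - L(q + h, qd - h)
                  - L(q - h, qd + h) + L(q - h, qd - h)) / (4 * h * h)
    d2L_dqd2 = (L(q, qd + h) - 2 * L(q, qd) + L(q, qd - h)) / (h * h)
    if abs(d2L_dqd2) < 1e-12:
        # Regularity condition (5.105) fails: X_E is not determined.
        raise ValueError("d2L/dqd^2 vanishes; condition (5.105) is violated")
    return (dL_dq - qd * d2L_dqd_dq) / d2L_dqd2


q0, qd0 = 0.4, 1.1
numeric = vbar(lagrangian, q0, qd0)
exact = -(G / ELL) * math.sin(q0)  # pendulum equation: qdd = -(g/l) sin q
print(numeric, exact)
```

The division by $\partial^2 L/\partial \dot q^2$ is where the regularity condition enters: if that derivative vanishes, the sketch raises an error instead of returning a value.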
We now follow the pattern of (5.92) and try to determine the integral curve of the Lagrangian vector field $X_E$,

$$\dot c(t) = (X_E)_{c(t)} .$$

Here $c : I \to TQ$ is a curve on $TQ$. In charts $c(t)$ is represented by the pair $(q(t), \dot q(t))$ and obeys the differential equations

$$\dot q(t) = v = \dot q , \qquad \ddot q(t) = \bar v = \Big(\frac{\partial L}{\partial q} - \dot q\, \frac{\partial^2 L}{\partial \dot q\,\partial q}\Big) \Big/ \frac{\partial^2 L}{\partial \dot q^2} .$$

While the first of these just tells us that the time derivative of the first coordinate is equal to the second, the second equation has a somewhat surprising form. True, it is obtained from the Euler-Lagrange equation

$$\frac{\partial L}{\partial q} - \frac{d}{dt}\, \frac{\partial L}{\partial \dot q} = 0$$

by taking the derivative with respect to $t$ and by solving for $\ddot q$. However, it is a differential equation of second order and therefore, geometrically speaking, it is different from the canonical equations (5.92). Let us consider this new feature in more detail.



5.6.4 Vector Fields on Velocity Space T Q and Lagrange Equations 

A smooth vector field $X$ that is defined on the tangent bundle $TQ$ of a manifold $Q$ leads from $TQ$ to $T(TQ)$, the tangent bundle of the tangent bundle,

$$X : TQ \to T(TQ) .$$

Let $\tau_Q$ denote the projection from $TQ$ to $Q$ and $T\tau_Q$ the corresponding tangent mapping. The composition $T\tau_Q \circ X$ maps $TQ$ onto $TQ$, as shown in Fig. 5.13. If this composition produces just the identity on $TQ$, i.e. if $T\tau_Q \circ X = \mathrm{id}_{TQ}$, the vector field $X$ defines a differential equation of second order. This follows from the following proposition.

Fig. 5.13. A vector field on $TQ$ generates an equation of second order if it fulfills the condition (5.111)

Proposition. The smooth vector field $X$ has the property

$$T\tau_Q \circ X = \mathrm{id}_{TQ} \tag{5.111}$$

if and only if each integral curve $c : I \to TQ$ of $X$ obeys the differential equation

$$(\tau_Q \circ c)^{\bullet} = c . \tag{5.112}$$

Proof. For each point $v_u \in TQ$ there is a curve going through that point such that $\dot c(\tau) = X(c(\tau))$, with $\tau \in I$. $T\tau_Q \circ X$ is the identity on $TQ$ precisely if $T\tau_Q \circ \dot c(\tau) = c(\tau)$ holds true. Working out the left-hand side, we obtain

$$T\tau_Q \circ \dot c(\tau) = T\tau_Q \circ Tc(\tau, 1) = T(\tau_Q \circ c)(\tau, 1) = (\tau_Q \circ c)^{\bullet}(\tau) ,$$

which proves (5.112). □

From a physicist’s point of view the integral curves $c$ of $X$ are not exactly the solutions one is looking for. Rather, we are interested in the orbits $\gamma$ on the base manifold $Q$ itself. These are the physical orbits in the manifold of generalized coordinates (the ones one can “see”), i.e. the orbits that we denoted by $\Phi_{s,t}(q_0)$ in earlier sections. It is not difficult, however, to obtain these curves from $c$ (integral curve of $X$ on $TQ$) and from $\tau_Q$ (projection of $TQ$ on $Q$). Indeed, $\gamma := \tau_Q \circ c : I \to Q$ is a curve on $Q$, since $c : I \to TQ$ and $\tau_Q : TQ \to Q$. A curve of this kind that is associated to the vector field $X$ is said to be a base integral curve. The condition (5.112) can be written as $\dot\gamma = c$, which means the following: the vector field $X$ defines a differential equation of second order if and only if each of its integral curves is equal to the derivative of its corresponding base integral curve $\gamma = \tau_Q \circ c$.
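The distinction between an integral curve $c$ on $TQ$ and its base integral curve $\gamma = \tau_Q \circ c$ can be made concrete with a small numerical sketch. It assumes, purely for illustration, the harmonic-oscillator Lagrangian $L = \dot q^2/2 - q^2/2$, for which the chart components of $X_E$ are $(\dot q, -q)$. The fourth-order Runge-Kutta integrator and all function names are choices of this sketch, not of the text.

```python
import math


def x_e(state):
    """Chart components (v, vbar) of X_E for the assumed Lagrangian
    L = qd**2/2 - q**2/2, i.e. v = qd and vbar = -q."""
    q, qd = state
    return (qd, -q)


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for the system d(state)/dt = f(state)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integral_curve(f, c0, dt, steps):
    """Sampled integral curve c(t) on TQ, starting at the point c0."""
    curve = [c0]
    for _ in range(steps):
        curve.append(rk4_step(f, curve[-1], dt))
    return curve


dt, steps = 1e-3, 2000
c = integral_curve(x_e, (1.0, 0.0), dt, steps)  # curve on TQ
gamma = [point[0] for point in c]               # base integral curve tau_Q o c

# Numerical version of (5.112): the derivative of gamma at an interior
# sample reproduces the fibre component of c at the same parameter value.
k = 1000
gamma_dot = (gamma[k + 1] - gamma[k - 1]) / (2 * dt)
print(gamma_dot, c[k][1], gamma[-1], math.cos(dt * steps))
```

Projecting onto the first component is the chart form of $\tau_Q$. The second component of $c$ carries no independent information, which is exactly the content of the second-order condition.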

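In charts the Lagrangian vector field $X_E$ reads

$$X_E = \sum_i v^i\, \partial_i + \sum_i \bar v^i\, \bar\partial_i$$

with $\partial_i := \partial/\partial q^i$, $\bar\partial_i := \partial/\partial \dot q^i$, cf. Sect. 5.6.3. Then $v^i = \dot q^i$, while the components $\bar v^i = \bar v^i(q, \dot q)$ fulfill the differential equations

$$\frac{d^2}{dt^2}\, q^i(t) = \bar v^i\big(q(t), \dot q(t)\big) . \tag{5.113}$$

As before, $(q, \dot q)$ is the local representation of the point $c(t)$ or $\dot\gamma(t)$. Of course, in charts one obtains the well-known Euler-Lagrange equations. In order to show this for the general case ($f > 1$), calculate $\omega_L(X_E, Y)$ as well as $dE(Y)$, for an arbitrary vector field

$$Y = \sum_i \big(w^i\, \partial_i + \bar w^i\, \bar\partial_i\big) .$$

One finds that

$$\omega_L(X_E, Y) = \sum_{i,k} \frac{\partial^2 L}{\partial \dot q^i\,\partial q^k}\,\big(v^i w^k - v^k w^i\big) + \sum_{i,k} \frac{\partial^2 L}{\partial \dot q^k\,\partial \dot q^i}\,\big(v^i \bar w^k - \bar v^k w^i\big) \tag{5.114}$$

and, similarly,

$$dE(Y) = \sum_{i,k} \Big(\dot q^i\, \frac{\partial^2 L}{\partial \dot q^k\,\partial \dot q^i}\, \bar w^k + \dot q^i\, \frac{\partial^2 L}{\partial q^k\,\partial \dot q^i}\, w^k - \delta_{ik}\, \frac{\partial L}{\partial q^i}\, w^k\Big) . \tag{5.115}$$

We insert $v^i = \dot q^i$ in (5.114) and set it equal to (5.115). The terms in $\bar w^k$ cancel, while the comparison of the coefficients of $w^k$ yields the equations

$$\frac{\partial L}{\partial q^k} - \sum_i \frac{\partial^2 L}{\partial q^i\,\partial \dot q^k}\, \dot q^i - \sum_i \frac{\partial^2 L}{\partial \dot q^i\,\partial \dot q^k}\, \bar v^i = 0 .$$

Finally, inserting the result (5.113), these equations become

$$\frac{\partial L}{\partial q^k} - \frac{d}{dt}\, \frac{\partial L}{\partial \dot q^k} = 0 ,$$

i.e. the set of Euler-Lagrange equations, as expected.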



5.6.5 The Legendre Transformation and the Correspondence 
of Lagrangian and Hamiltonian Functions 

We had assumed the mapping $\Phi_L$ (5.103) from $TQ$ to $T^*Q$ to be a diffeomorphism. Locally this means that the condition (5.105) is satisfied everywhere. As we learnt in Chap. 2, Sect. 2.15, we can then go over from Lagrangian mechanics to Hamilton-Jacobi mechanics and vice versa, as we wish. In this section we want to clarify this relationship using the geometric language.

With $\Phi_L$ a diffeomorphism, geometric objects can be transported between $TQ$ and $T^*Q$ at will. For example, if $X : TQ \to T(TQ)$ is a vector field on $TQ$, then

$$Y := T\Phi_L \circ X \circ \Phi_L^{-1} : T^*Q \to T(T^*Q)$$

is a vector field on the manifold $T^*Q$. Here, $T\Phi_L$ is the tangent mapping corresponding to $\Phi_L$. It relates $T(TQ)$ with $T(T^*Q)$. As $\Phi_L$ is a diffeomorphism, $T\Phi_L$ is an isomorphism. In this case one has the following results.
(i) Proposition. Let the Lagrangian function be such that $\Phi_L$ is a diffeomorphism. Let $E$ be the function on $TQ$ defined by (5.109b). Finally, define the function

$$H := E \circ \Phi_L^{-1} : T^*Q \to \mathbb{R}$$

on $T^*Q$. Then the Lagrangian vector field $X_E$ and the vector field $X_H$, which corresponds to $H$ by the definition (5.91), are related by

$$T\Phi_L \circ X_E \circ \Phi_L^{-1} = X_H . \tag{5.116}$$

$\Phi_L$ maps the integral curves of $X_E$ onto those of $X_H$. The two vector fields, $X_E$ on $TQ$ and $X_H$ on $T^*Q$, have the same base integral curves (i.e. the same physical solutions on $Q$).



Proof. It is sufficient to establish the relation (5.116) because the remaining assertions all follow from it. Let $v \in TQ$, $w \in T_v(TQ)$, let $v^*$ be the image of $v$ by $\Phi_L$, and $w^*$ the image of $w$ by the tangent mapping $T\Phi_L$, i.e. $w^* = T_v\Phi_L(w)$. At the point $v$ we then have

$$\omega_0\big(T\Phi_L(X_E), w^*\big) = \omega_L(X_E, w) = dE(w) = d(H \circ \Phi_L)(w) .$$

On the other hand, at the point $v^* = \Phi_L(v)$, we know that

$$\omega_0\big(T\Phi_L(X_E), w^*\big) = dH(w^*) = \omega_0(X_H, w^*) .$$

The assertion (5.116) now follows because $w^*$ is arbitrary, $T\Phi_L$ is an isomorphism, and $\omega_0$ is not degenerate. It is then also clear that the integral curves of $X_E$ and $X_H$ are related by $\Phi_L$. Finally, denoting the projections from $TQ$ and from $T^*Q$ to $Q$ by $\tau_Q$ and by $\tau^*_Q$, respectively, we know that $\tau_Q = \tau^*_Q \circ \Phi_L$. Hence, the base integral curves are the same. □

(ii) The canonical one-form $\theta_0$ (5.74) is closely related to the action $W$, (5.109a). With $H = E \circ \Phi_L^{-1}$, one has

$$\theta_0(X_H) = W \circ \Phi_L^{-1} . \tag{5.117a}$$

Conversely, if $\theta_L := \Phi_L^*\theta_0$ is the pull-back of the canonical one-form to $TQ$, then

$$\theta_L(X_E) = W . \tag{5.117b}$$

In charts this is easy to verify. For example, (5.117a) is equivalent to the statement that $\theta_0(X_H) \circ \Phi_L$ is equal to $W$. We have

$$\theta_0(X_H) = \sum_i p_i\, \frac{\partial H}{\partial p_i} ,$$

so that, indeed,

$$\theta_0(X_H) \circ \Phi_L = \sum_i \frac{\partial L}{\partial \dot q^i}\, \dot q^i = W .$$
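A short worked sketch may help to fix (5.116) and (5.117a). It assumes one degree of freedom and the standard Lagrangian $L = \tfrac12 m \dot q^2 - U(q)$, with $m > 0$ and $U$ as hypothetical inputs that are not part of the text.

```latex
W = \dot q\, \frac{\partial L}{\partial \dot q} = m \dot q^2 ,
\qquad
E = W - L = \tfrac12 m \dot q^2 + U(q) ,
\qquad
\Phi_L^{-1}(q, p) = (q, p/m) ,
\\[4pt]
H = E \circ \Phi_L^{-1} = \frac{p^2}{2m} + U(q) ,
\qquad
\theta_0(X_H) = p\, \frac{\partial H}{\partial p} = \frac{p^2}{m} = W \circ \Phi_L^{-1} .
```

In this example the integral curves of $X_E$, solutions of $\ddot q = -U'(q)/m$, and those of $X_H$, solutions of $\dot q = p/m$, $\dot p = -U'(q)$, project onto the same curves $q(t)$ in $Q$.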

(iii) A transformation analogous to (5.103) can also be defined for the inverse direction, i.e. going from $T^*Q$ to $TQ$. Let $H$ be a smooth function on $T^*Q$. In analogy to the definition (5.103) let us define the transformation

$$\Phi_H : T^*Q \to T^{**}Q \cong TQ . \tag{5.118}$$

If this mapping $\Phi_H$ is a diffeomorphism$^8$, one can define the quantities

$$E := H \circ \Phi_H^{-1} , \qquad W := \theta_0(X_H) \circ \Phi_H^{-1} , \qquad L := W - E \tag{5.119}$$

in analogy to (ii) above. This yields a Lagrangian system on $TQ$ with $L$ the Lagrangian function. For this $L$ we again construct the mapping $\Phi_L$ (5.103). It then follows that $\Phi_L = \Phi_H^{-1}$, or $\Phi_L \circ \Phi_H = \mathrm{id}_{T^*Q}$ and $\Phi_H \circ \Phi_L = \mathrm{id}_{TQ}$. This leads to the following theorem.



Theorem. The Lagrangian functions on $TQ$ for which the corresponding mappings $\Phi_L$ are diffeomorphisms, and the Hamiltonian functions for which the corresponding $\Phi_H$ are diffeomorphisms, correspond to each other in a bijective manner.



The proof, which is simple, makes use of the tools introduced above, see e.g. 
Abraham and Marsden (1981). Thus, under the assumptions stated above, there 
is a one-to-one correspondence between the two descriptions of mechanics. The 
relationship between them is illustrated once more in Fig. 5.14. 

Fig. 5.14. [Diagram not reproduced; it relates $T(TQ)$ and $T(T^*Q)$ through the tangent mapping $T\Phi_H$. Surviving caption fragment: "... correspond bijectively".]

Note that most systems studied in nonrelativistic mechanics have this property. As a counterexample, however, we remind the reader of the relativistic description of a free particle, Sect. 4.11: in this case the Lagrangian function did not have the regularity required for $\Phi_L$ to be a diffeomorphism.

8 As is easy to guess, this is true if $\det(\partial^2 H/\partial p_k\,\partial p_j)$ vanishes nowhere.



5.7 Riemannian Manifolds in Mechanics 

A Riemannian manifold $(M, g)$ is a differential manifold $M$ equipped with a metric $g$. Differential or smooth manifolds are defined in Sect. 5.2.2; the metric is a smooth tensor field of type $\mathcal{T}^0_2(M)$, and its properties are summarized in definition (ME) in Sect. 5.5.1. As can be seen from (5.67b), or from (5.69), the metric defines a scalar product on $T_pM$, the tangent space attached to the point $p \in M$. This scalar product is often written in the “bra” and “ket” notation, i.e. by making use of the symbols $\langle \ldots |$ and $| \ldots \rangle$, such that

$$g_p(v, w) \equiv \langle v | w \rangle , \qquad v, w \in T_pM . \tag{5.120}$$

The phase space of a Hamiltonian system is a symplectic manifold, cf. the definition (S2) in Sect. 5.5.4. Symplectic manifolds are very different from Riemannian manifolds: While all symplectic manifolds look the same locally, this is not true for Riemannian manifolds. The first statement is the content of Darboux’ theorem (Sect. 5.5.4) which may be expressed in more physical terms by the statement that locally and outside of equilibrium positions, any Hamiltonian vector field can be rectified (cf. Sect. 2.37.1)$^9$.

In this section we show that for certain systems of Lagrangian mechanics the 
coordinate manifold Q can be interpreted as a Riemannian manifold with the metric 
as defined by the kinetic energy; and that solutions of the Euler-Lagrange equations 
are nothing but geodesics of Q. In this way we discover another illustration and 
example of the geometrical nature of mechanics; at the same time we prepare the 
ground for general relativity, which is a geometrical theory, in an even deeper sense. 

In what follows we first introduce the notions of parallel transport and affine 
connection that one needs in order to define parallel vector fields and to write 
down the geodesic equation. We then show that geodesics are solutions of Euler- 
Lagrange equations and conclude with a beautiful application of this somewhat 
formal chapter. 



9 The global properties of symplectic manifolds are the subject of an important research field of 
mathematics. The present state of the art is described in the book by Hofer and Zehnder (1994). 
This book should be readily accessible for the mathematically minded reader. 
5.7.1 Affine Connection and Parallel Transport



To begin with, let $M$ simply be a Euclidean space $\mathbb{R}^n$, equipped with the metric defined in (5.5) and (5.6). Let $W = \sum_i W^i \partial_i$ and $V = \sum_i V^i \partial_i$ be smooth vector fields on $M$, $V_p \in T_pM$ the local representative of $V$ at the point $p$. We now ask the question of how $W$ at $p$ will change in the direction of $V_p$. The answer is simple in this case: At the point $p$ we let $V$ act on the functions (the components) $W^i(p)$ and use the result to construct the vector field $\sum_i V(W^i)\, \partial_i$. This is the local and natural expression for the covariant derivative of $W$ with respect to $V$,

$$D_V(W) = \sum_{i=1}^{n} V(W^i)\, \partial_i = \sum_{i,k=1}^{n} V^k\, \frac{\partial W^i}{\partial x^k}\, \partial_i . \tag{5.121}$$

Obviously this expression is linear in $W$. Regarding its dependence on $V$ it is also possible to calculate the covariant derivative along the sum of two vector fields, viz. $D_{V_1 + V_2} W = D_{V_1} W + D_{V_2} W$, as well as along the vector field $f \cdot V$, where $f$ is a smooth function on $M$, viz. $D_{fV} W = f\,(D_V W)$. However, letting $D_V$ act on the vector field $(f \cdot W)$ is a different matter; one finds

$$D_V(fW) = \sum_{i=1}^{n} V(f W^i)\, \partial_i = (Vf) \sum_{i=1}^{n} W^i \partial_i + f \sum_{i=1}^{n} V(W^i)\, \partial_i = (Vf)\, W + f\, D_V W .$$

This formula expresses a generalized product rule, or Leibniz rule. 

In case of a smooth manifold $M$ which is not $\mathbb{R}^n$ the formula (5.121) no longer holds, and there is no obvious and natural definition of a covariant derivative. Asking the question of how a vector field $W$ changes along the direction of another vector field means that we have to compare elements of two distinct tangent spaces, say $W_p \in T_pM$ with $W_q \in T_qM$. In order to make such a comparison possible we first need to know how to transport $W_p$ in a parallel fashion from $T_pM$ to $T_qM$ (by means of a vector space isomorphism). Only then can one compare the result of the parallel transport with $W_q$. As parallel transport, in general, is not given in a canonical way, an explicit rule is necessary. It needs to be constructed in a manner consistent with what we know from the flat space $\mathbb{R}^n$. Fixing the rule of parallel transport on a smooth manifold means choosing what is called a connection. The example studied above suggests the following defining properties of a connection $D$:
CONN. A connection $D$ on a smooth manifold $M$ is a mapping

$$D : \mathcal{V}(M) \times \mathcal{V}(M) \to \mathcal{V}(M) , \tag{5.122}$$

which has the following properties:

(i) It is $\mathcal{F}(M)$-linear in the first argument, that is to say

$$D_{V_1 + V_2} W = D_{V_1} W + D_{V_2} W , \tag{5.123a}$$

$$D_{fV} W = f\,(D_V W) ; \tag{5.123b}$$

(ii) it is $\mathbb{R}$-linear in its second argument, that is

$$D_V(\lambda_1 W_1 + \lambda_2 W_2) = \lambda_1 D_V(W_1) + \lambda_2 D_V(W_2) , \qquad \lambda_1, \lambda_2 \in \mathbb{R} ; \tag{5.124}$$

(iii) it obeys the Leibniz rule

$$D_V(fW) = (Vf)\, W + f\, D_V W , \qquad f \in \mathcal{F}(M) . \tag{5.125}$$

The vector field $D_V W$ is called the covariant derivative of $W$ along $V$ and with reference to the connection $D$.

Clearly, the parallel transport is fixed if its action on all base vectors is known. Therefore, if in a local chart we choose $V = \partial_i$ and $W = \partial_j$, the result is again a vector field which can be expanded along base fields,

$$D_{\partial_i}(\partial_j) = \sum_{k=1}^{n} \Gamma^k_{ij}\, \partial_k . \tag{5.126}$$

This equation defines the Christoffel symbols of the connection $D$. For example, if one computes the covariant derivative of a vector field $W$ along the base field $\partial_i$, equations (5.125) and (5.126) yield the following local expression

$$D_{\partial_i}(W) = \sum_k \Big(\frac{\partial W^k}{\partial x^i} + \sum_j \Gamma^k_{ij}\, W^j\Big)\, \partial_k . \tag{5.127}$$

One of the central theorems of Riemannian geometry is the following: Among the set of connections on a Riemannian manifold $M$ there is a special, uniquely determined connection which, in addition to (5.123-125) has the properties

$$[V, W] = D_V W - D_W V , \tag{5.128}$$

$$X \langle V | W \rangle = \langle D_X V | W \rangle + \langle V | D_X W \rangle \quad \text{for all } X, V, W \in \mathcal{V}(M) . \tag{5.129}$$

This special connection is called the Levi-Civita connection.
The first of the additional properties (5.128) says that the commutator (5.32) of the vector fields $V$ and $W$ equals the difference of the covariant derivative of $W$ along $V$ and of $V$ along $W$. By applying (5.128) to two base fields $\partial_i$ and $\partial_j$ we see that the Christoffel symbols are symmetric in their lower indices,

$$\Gamma^k_{ij} = \Gamma^k_{ji} . \tag{5.130}$$

Indeed the left-hand side vanishes because the base fields commute; the right-hand side, according to (5.126), gives (5.130)$^{10}$.

As it also possesses the property (5.129) the Levi-Civita connection is said to be metric. Indeed, the covariant derivative can also be applied to other smooth objects defined on $M$ such as the metric $g$. One can show that (5.129) is equivalent to the condition $Dg = 0$, which says that the covariant derivative of the metric along any smooth vector field vanishes.

In local coordinates the Christoffel symbols can be expressed in terms of derivatives of the metric tensor $g_{ij}$, as well as by its inverse $g^{km}$. We skip this calculation and simply quote the result

$$\Gamma^k_{ij} = \frac{1}{2} \sum_m g^{km} \Big(\frac{\partial g_{jm}}{\partial x^i} + \frac{\partial g_{mi}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^m}\Big) . \tag{5.131}$$

The symmetry (5.130) is obvious in this explicit formula.
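Formula (5.131) is easy to evaluate numerically. The sketch below is a minimal illustration under stated assumptions: the metric is that of the unit 2-sphere in coordinates $(\theta, \phi)$, the derivatives of the metric are approximated by central differences, and the helper names (`metric_sphere`, `inverse_2x2`, `christoffel`) are hypothetical. The matrix inversion is hard-coded for two dimensions.

```python
import math


def metric_sphere(x):
    """Assumed example metric: unit 2-sphere, x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix (sufficient for this two-dimensional sketch)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Gamma[k][i][j] = Gamma^k_ij from (5.131); metric derivatives are
    approximated by central differences with step h."""
    n = len(x)
    dg = []  # dg[m][i][j] = d g_ij / d x^m
    for m in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[m] += h
        x_minus[m] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[i][j] - g_minus[i][j]) / (2 * h)
                    for j in range(n)] for i in range(n)])
    g_inv = inverse_2x2(metric(x))
    return [[[0.5 * sum(g_inv[k][m] * (dg[i][j][m] + dg[j][m][i] - dg[m][i][j])
                        for m in range(n))
              for j in range(n)] for i in range(n)] for k in range(n)]


theta0 = 1.0
gamma = christoffel(metric_sphere, [theta0, 0.3])
# Known closed forms on the sphere, for comparison:
expected_theta_phiphi = -math.sin(theta0) * math.cos(theta0)
expected_phi_thetaphi = math.cos(theta0) / math.sin(theta0)
print(gamma[0][1][1], expected_theta_phiphi)
print(gamma[1][0][1], expected_phi_thetaphi)
```

The symmetry (5.130) shows up as `gamma[k][i][j] == gamma[k][j][i]` up to rounding, because the first two terms of (5.131) are exchanged under $i \leftrightarrow j$ while the third is symmetric.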



5.7.2 Parallel Vector Fields and Geodesics 



A smooth curve $\alpha : I \subset \mathbb{R} \to M$ on the manifold $M$ is itself a smooth, one-dimensional manifold. Consider a smooth vector field $Z \in \mathcal{V}(\alpha)$ on this submanifold of $M$. Let $\tau$ be the parameter describing the curve, let the dot denote the derivative with respect to $\tau$ and let $\dot\alpha$ be its tangent vector field. The derivative of $Z$ with respect to $\tau$ can then be computed as follows,

$$\dot Z = \sum_k \frac{dZ^k}{d\tau}\, \partial_k + \sum_k Z^k\, D_{\dot\alpha}(\partial_k) = \sum_k \Big(\frac{dZ^k}{d\tau} + \sum_{m,n} \Gamma^k_{nm}\, \frac{d(x^n \circ \alpha)}{d\tau}\, Z^m\Big)\, \partial_k .$$

This is the rate of change of $Z$ as one moves along the curve. In particular, if $\dot Z = 0$ the vector field $Z$ is said to be parallel. Given a tangent vector $z \in T_{\alpha(\tau_0)}M$ at the point $\alpha(\tau_0)$ on the curve we can now state precisely how to perform parallel transport of a given vector along the curve $\alpha$. In particular, for every smooth curve $\alpha : I \to M$ there is a unique parallel vector field $Z$ such that at $\tau = \tau_0$ it equals a given tangent vector, say $Z(\tau_0) = z$.

A case of special interest is when $Z = \dot\alpha$, i.e. where $Z$ is the tangent vector field of a curve $\alpha$. Obviously, $\dot Z$ is then none other than the acceleration $\ddot\alpha$. Geodesics, from the point of view of physics, describe motion of free fall on the manifold, i.e. motion with vanishing acceleration. Geometrically speaking they are curves on the manifold which link arbitrary points $p$ and $q$ such that the length of the arc $pq$ is extremal. An elementary example is provided by the unit sphere in three dimensions, $M = S^2$, where the geodesics are the great circles. The geodesic distance between any two points $A, B \in S^2$ is either a minimum (if the smaller segment of the great circle joining them is chosen), or a maximum (if the larger segment is chosen). If $A$ and $B$ are antipodes, the geodesic length corresponds to a saddle point (cf. Sect. 2.36).

10 The condition (5.128) expresses the fact that the Levi-Civita connection has vanishing torsion.

These remarks illustrate the geometrical definition of geodesics.

Geodesics on a smooth Riemannian manifold are smooth curves $\gamma : I \to M$ whose tangent vector field $\dot\gamma$ is parallel.

This definition and our previous remarks allow us to write down a differential equation for geodesics in local coordinates. It reads

$$\frac{d^2}{d\tau^2}(x^i \circ \gamma) + \sum_{jk} \Gamma^i_{jk}(\gamma)\, \frac{d}{d\tau}(x^j \circ \gamma)\, \frac{d}{d\tau}(x^k \circ \gamma) = 0 . \tag{5.132a}$$

Here, the functions $(x^i \circ \gamma)$ are coordinate functions on the curve $\gamma$. As their meaning is obvious and as there is no real danger of confusion one simplifies the notation by writing just $x^i$ for short. The geodesic equation then takes the simpler form

$$\ddot x^i + \sum_{jk} \Gamma^i_{jk}(\gamma)\, \dot x^j \dot x^k = 0 . \tag{5.132b}$$
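The geodesic equation (5.132b) can be integrated numerically to check the statement that the geodesics of $S^2$ are great circles. The following sketch assumes the unit sphere with coordinates $(\theta, \phi)$ and uses its nonvanishing Christoffel symbols $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$. The integrator, the initial data and all names are illustrative choices, not part of the text.

```python
import math


def geodesic_rhs(state):
    """(5.132b) on the unit sphere as a first-order system for
    state = (theta, phi, theta_dot, phi_dot)."""
    theta, _phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for d(state)/dtau = f(state)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def embed(theta, phi):
    """Point of the unit sphere in R^3."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def embed_velocity(theta, phi, dtheta, dphi):
    """Velocity in R^3 of the embedded curve."""
    return (math.cos(theta) * math.cos(phi) * dtheta
            - math.sin(theta) * math.sin(phi) * dphi,
            math.cos(theta) * math.sin(phi) * dtheta
            + math.sin(theta) * math.cos(phi) * dphi,
            -math.sin(theta) * dtheta)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def speed_squared(state):
    theta, _phi, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2


state = (math.pi / 2, 0.0, 0.3, 1.0)  # start on the equator, tilted velocity
normal = cross(embed(state[0], state[1]), embed_velocity(*state))
speed2_start = speed_squared(state)

dt, steps = 1e-3, 5000
max_off_plane = 0.0
max_speed_drift = 0.0
for _ in range(steps):
    state = rk4_step(geodesic_rhs, state, dt)
    point = embed(state[0], state[1])
    max_off_plane = max(max_off_plane, abs(dot(normal, point)))
    max_speed_drift = max(max_speed_drift,
                          abs(speed_squared(state) - speed2_start))
print(max_off_plane, max_speed_drift)
```

A great circle lies in a plane through the centre of the sphere, so the scalar product of the curve with the initial normal vector should stay at zero. The squared speed is monitored as well, since a curve with parallel tangent vector is traversed at constant speed. The chart is singular at the poles, which this particular initial condition avoids.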



5.7.3 Geodesics as Solutions of Euler-Lagrange Equations 

As we have seen, geodesics describe force-free, unaccelerated motion on a given manifold. They are curves whose length is an extremum and, therefore, they are solutions of Euler-Lagrange equations. This is the content of the following theorem.

Theorem on geodesics. Let $(Q, g)$ be a Riemannian manifold and

$$L : TQ \to \mathbb{R} , \qquad L(v) = \tfrac12 \langle v | v \rangle$$

a Lagrangian function. A curve $\gamma$ is a solution of the Euler-Lagrange equations if and only if it is a geodesic on $Q$.



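Proof: In local coordinates the Lagrangian function reads

$$L(v) = \frac12 \sum_{ij} g_{ij}\, v^i v^j \equiv \frac12 \sum_{ij} g_{ij}(q)\, \dot q^i \dot q^j .$$

Lagrange’s equations (2.18) yield

$$\frac{d}{dt} \Big(\sum_j g_{ij}\, \dot q^j\Big) - \frac12 \sum_{jk} \frac{\partial g_{jk}}{\partial q^i}\, \dot q^j \dot q^k = 0 . \tag{5.133}$$

We calculate the time derivative in the first term

$$\frac{d}{dt} \Big(\sum_j g_{ij}\, \dot q^j\Big) = \sum_j g_{ij}\, \ddot q^j + \sum_{jk} \frac{\partial g_{ij}}{\partial q^k}\, \dot q^j \dot q^k ,$$

multiply the entire equation from the left with the inverse $g^{li}$ of the metric tensor, and sum over $i$ to obtain the differential equation

$$\ddot q^l + \sum_{ijk} g^{li} \Big(\frac{\partial g_{ij}}{\partial q^k} - \frac12 \frac{\partial g_{jk}}{\partial q^i}\Big)\, \dot q^j \dot q^k = \ddot q^l + \frac12 \sum_{ijk} g^{li} \Big(\frac{\partial g_{ij}}{\partial q^k} + \frac{\partial g_{ik}}{\partial q^j} - \frac{\partial g_{jk}}{\partial q^i}\Big)\, \dot q^j \dot q^k = 0 .$$

In the second step we have written the first term of the expression within brackets twice by making use of the symmetry of $\dot q^j \dot q^k$ in $j$ and $k$. In its second form, upon inserting the formula (5.131) for the Christoffel symbols, the differential equation becomes precisely the geodesic equation (5.132b). This proves the theorem.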
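As a worked check of the theorem, consider the unit sphere $S^2$ with coordinates $(\theta, \phi)$ and metric $g = \mathrm{diag}(1, \sin^2\theta)$. This example and its notation are assumptions made for illustration; the text itself does not carry out the calculation.

```latex
L = \tfrac12 \big(\dot\theta^2 + \sin^2\theta\, \dot\phi^2\big) ,
\\[4pt]
\frac{d}{dt} \frac{\partial L}{\partial \dot\theta} - \frac{\partial L}{\partial \theta}
   = \ddot\theta - \sin\theta \cos\theta\, \dot\phi^2 = 0 ,
\qquad
\frac{d}{dt} \frac{\partial L}{\partial \dot\phi}
   = \sin^2\theta\, \ddot\phi + 2 \sin\theta \cos\theta\, \dot\theta \dot\phi = 0 .
```

Dividing the second equation by $\sin^2\theta$ gives $\ddot\phi + 2 \cot\theta\, \dot\theta \dot\phi = 0$. Both equations have the form (5.132b) with $\Gamma^\theta_{\phi\phi} = -\sin\theta \cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$, which are the values that (5.131) yields for this metric.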

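Remark: With $L = T = \sum_{ik} g_{ik}\, \dot q^i \dot q^k / 2$ and with $T$ the kinetic energy, (5.133) shows that the geodesic equation has the form of (2.18). The integral

$$\lambda := \int_{\tau_1}^{\tau_2} d\tau\, \sqrt{\sum_{ik} g_{ik}(q(\tau))\, \dot q^i \dot q^k} = \int_{\tau_1}^{\tau_2} d\tau\, \sqrt{2T} \tag{5.134}$$

is the length of the curve with boundary values $\gamma(\tau_1) = a$ and $\gamma(\tau_2) = b$. As long as $T$ does not vanish these geodesics are curves whose length $\lambda$ is extremal because, as $T \neq 0$ and as $T$ is constant along solutions of (5.133),

$$\frac{d}{d\tau} \frac{\partial \sqrt{T}}{\partial \dot q^i} - \frac{\partial \sqrt{T}}{\partial q^i} = \frac{1}{2\sqrt{T}} \Big(\frac{d}{d\tau} \frac{\partial T}{\partial \dot q^i} - \frac{\partial T}{\partial q^i}\Big) .$$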



5.7.4 Example: Force-Free Asymmetric Top 

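We wish to conclude this chapter by illustrating these general results by means of a particularly beautiful example$^{11}$: We show that Euler’s equations (3.59) are geodesic equations on the Riemannian manifold $M = SO(3)$, with the metric being determined by the inertia tensor $\mathbf{J}$.

11 V.I. Arnol'd: Ann. Inst. Fourier 16, 319 (1966)

We start by recalling that by using $\mathbf{S}(\boldsymbol{\varphi}) = \sum_{i=1}^{3} \varphi_i \mathbf{J}_i$ (the $\mathbf{J}_i$ being the generators of infinitesimal rotations, not to be confused with the inertia tensor) the rotation matrix (3.45a) can be written as an exponential series in $\mathbf{S}$ and that the action of the latter on any vector equals the cross product of $\boldsymbol{\varphi}$ with that vector (cf. Sect. 2.22). Thus, in symbols

$$\mathbf{R}(\boldsymbol{\varphi}) = \exp\{-\mathbf{S}(\boldsymbol{\varphi})\} \quad \text{and} \quad \mathbf{S}(\boldsymbol{\varphi})\, \mathbf{x} = \boldsymbol{\varphi} \times \mathbf{x} .$$

The action of the matrix $\boldsymbol{\Omega}(\tau) = \dot{\mathbf{R}}^T(\tau)\, \mathbf{R}(\tau)$, eq. (3.53), on a vector in the laboratory system is that given in (3.56b). Clearly, these formulas can be rotated to the body-fixed system,

$$\bar{\boldsymbol{\Omega}}\, \bar{\mathbf{x}} = \bar{\boldsymbol{\omega}} \times \bar{\mathbf{x}} , \quad \text{where} \quad \bar{\boldsymbol{\omega}} = \mathbf{R} \boldsymbol{\omega} , \quad \bar{\boldsymbol{\Omega}} = \mathbf{R} \boldsymbol{\Omega} \mathbf{R}^T . \tag{5.135}$$

The Lagrangian function is equal to the kinetic energy expressed in the body-fixed system,

$$L = T = \frac12\, \bar{\boldsymbol{\omega}} \cdot \mathbf{J} \cdot \bar{\boldsymbol{\omega}} . \tag{5.136}$$

Let $\mathbf{R}(\tau)$ be a smooth curve on the manifold $M = SO(3)$ which assumes the boundary values $\mathbf{R}(\tau_1) = \mathbf{R}_1$ and $\mathbf{R}(\tau_2) = \mathbf{R}_2$ and which is such that the length (5.134) is an extremum. We show that any such geodesic obeys Euler’s equations (3.58) for vanishing external torque.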

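Let $\mathbf{R}(\tau)$ be a geodesic, $\mathbf{R}_0 \in SO(3)$ a constant, fixed rotation. We compute

$$\Big[\frac{d}{d\tau} \big(\mathbf{R}_0 \mathbf{R}(\tau)\big)^T\Big] \big(\mathbf{R}_0 \mathbf{R}(\tau)\big) = \Big[\frac{d}{d\tau} \mathbf{R}^T(\tau)\Big]\, \mathbf{R}_0^T \mathbf{R}_0\, \mathbf{R}(\tau) = \dot{\mathbf{R}}^T(\tau)\, \mathbf{R}(\tau) .$$

From this we conclude that $\boldsymbol{\Omega}$, and hence also $\boldsymbol{\omega}$ as well as $\bar{\boldsymbol{\omega}}$ remain unchanged. This means that if $\mathbf{R}(\tau)$ is a geodesic, so is $(\mathbf{R}_0 \mathbf{R}(\tau))$. Therefore, it is sufficient to discuss the special geodesic which goes through $\mathbf{R}(\tau = 0) = 1$. In this case $\dot{\mathbf{R}}^T(0) = \boldsymbol{\Omega}(0)$. We compute $\bar{\boldsymbol{\Omega}}$ in the neighborhood of $\boldsymbol{\varphi} = 0$ as follows

$$\bar{\boldsymbol{\Omega}} = \mathbf{R} \boldsymbol{\Omega} \mathbf{R}^{-1} = \mathbf{R}(\tau)\, \dot{\mathbf{R}}^T(\tau) = \big(1 - \mathbf{S} + \ldots\big) \Big(\dot{\mathbf{S}} + \frac12 \dot{\mathbf{S}} \mathbf{S} + \frac12 \mathbf{S} \dot{\mathbf{S}} + \ldots\Big) = \dot{\mathbf{S}} - \frac12 [\mathbf{S}, \dot{\mathbf{S}}] + O(\varphi^2) ,$$

and use the identity (cf. Sect. 3.12)

$$[\mathbf{S}, \dot{\mathbf{S}}] \equiv [\mathbf{S}(\boldsymbol{\varphi}), \mathbf{S}(\dot{\boldsymbol{\varphi}})] = \mathbf{S}(\boldsymbol{\varphi} \times \dot{\boldsymbol{\varphi}}) .$$

Note that $\boldsymbol{\varphi}$ and $\dot{\boldsymbol{\varphi}}$ here are independent variables and need not have the same direction. We conclude that

$$\bar{\boldsymbol{\Omega}} = \mathbf{S}(\dot{\boldsymbol{\varphi}}) - \frac12\, \mathbf{S}(\boldsymbol{\varphi} \times \dot{\boldsymbol{\varphi}}) + O(\varphi^2) .$$

On the other hand (5.135) implies that $\bar{\boldsymbol{\Omega}} = \mathbf{S}(\bar{\boldsymbol{\omega}})$ and we conclude that

$$\bar{\boldsymbol{\omega}} = \dot{\boldsymbol{\varphi}} - \frac12\, \boldsymbol{\varphi} \times \dot{\boldsymbol{\varphi}} + O(\varphi^2) . \tag{5.137}$$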



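Inserting (5.137) into (5.136) and keeping track of the symmetry of the inertia tensor we have

$$T = \frac12\, \dot{\boldsymbol{\varphi}} \cdot \mathbf{J} \cdot \dot{\boldsymbol{\varphi}} - \frac12\, \dot{\boldsymbol{\varphi}} \cdot \mathbf{J} \cdot (\boldsymbol{\varphi} \times \dot{\boldsymbol{\varphi}}) + O(\varphi^2) .$$

In calculating, in a next step, the derivatives with respect to $\boldsymbol{\varphi}$ and to $\dot{\boldsymbol{\varphi}}$ it is useful to rewrite the second term of $T$ by using the identities $\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}) = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b})$ with $\mathbf{a} = (\dot{\boldsymbol{\varphi}} \cdot \mathbf{J})^T = \mathbf{J} \cdot \dot{\boldsymbol{\varphi}}$, $\mathbf{b} = \boldsymbol{\varphi}$, and $\mathbf{c} = \dot{\boldsymbol{\varphi}}$. To first order we find

$$\frac{\partial T}{\partial \boldsymbol{\varphi}} = -\frac12\, \dot{\boldsymbol{\varphi}} \times (\mathbf{J} \dot{\boldsymbol{\varphi}}) + O(\varphi) = -\frac12\, \bar{\boldsymbol{\omega}} \times (\mathbf{J} \bar{\boldsymbol{\omega}}) + O(\varphi) .$$

In much the same way one finds

$$\frac{d}{d\tau} \frac{\partial T}{\partial \dot{\boldsymbol{\varphi}}} = \mathbf{J} \ddot{\boldsymbol{\varphi}} - \frac12\, (\mathbf{J} \dot{\boldsymbol{\varphi}}) \times \dot{\boldsymbol{\varphi}} + O(\varphi) = \mathbf{J} \dot{\bar{\boldsymbol{\omega}}} + \frac12\, \bar{\boldsymbol{\omega}} \times (\mathbf{J} \bar{\boldsymbol{\omega}}) + O(\varphi) .$$

Thus, taking $\boldsymbol{\varphi} = 0$ one recovers the geodesic equation

$$\frac{d}{d\tau} \frac{\partial T}{\partial \dot{\boldsymbol{\varphi}}} - \frac{\partial T}{\partial \boldsymbol{\varphi}} = \mathbf{J} \dot{\bar{\boldsymbol{\omega}}} + \bar{\boldsymbol{\omega}} \times (\mathbf{J} \bar{\boldsymbol{\omega}}) = 0 . \tag{5.138}$$

This equation is identical to Euler’s equations (3.58), with $\bar{\mathbf{D}} = 0$. Thus, Euler’s equations of motion have a simple geometrical interpretation which is helpful in visualizing their content:

The spinning top without external forces follows geodesics of the smooth manifold $SO(3)$.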
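The force-free Euler equations $\mathbf{J} \dot{\bar{\boldsymbol{\omega}}} + \bar{\boldsymbol{\omega}} \times (\mathbf{J} \bar{\boldsymbol{\omega}}) = 0$ can be explored numerically. The sketch below is a minimal illustration under stated assumptions: the inertia tensor is taken diagonal in the body-fixed frame with hypothetical principal moments, the integrator is a classical Runge-Kutta scheme, and all names are illustrative. It monitors the kinetic energy (5.136) and the squared angular momentum, both of which should stay constant along a solution.

```python
INERTIA = (1.0, 2.0, 3.0)  # hypothetical principal moments of inertia


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def euler_rhs(omega):
    """Body-frame angular acceleration from J d(omega)/dtau = -omega x (J omega)."""
    j_omega = tuple(j * w for j, w in zip(INERTIA, omega))
    return tuple(-c / j for c, j in zip(cross(omega, j_omega), INERTIA))


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for d(state)/dtau = f(state)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def kinetic_energy(omega):
    """T = (1/2) omega . J . omega, cf. (5.136)."""
    return 0.5 * sum(j * w * w for j, w in zip(INERTIA, omega))


def angular_momentum_squared(omega):
    return sum((j * w) ** 2 for j, w in zip(INERTIA, omega))


omega = (1.0, 0.1, 0.5)
energy_start = kinetic_energy(omega)
l2_start = angular_momentum_squared(omega)

dt, steps = 1e-3, 10000
max_energy_drift = 0.0
max_l2_drift = 0.0
for _ in range(steps):
    omega = rk4_step(euler_rhs, omega, dt)
    max_energy_drift = max(max_energy_drift,
                           abs(kinetic_energy(omega) - energy_start))
    max_l2_drift = max(max_l2_drift,
                       abs(angular_momentum_squared(omega) - l2_start))
print(max_energy_drift, max_l2_drift)
```

Conservation of $T$ is the numerical counterpart of the statement that a geodesic is traversed at constant speed with respect to the metric defined by the inertia tensor. The sketch integrates only the angular velocity, not the curve $\mathbf{R}(\tau)$ on $SO(3)$ itself.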




6. Stability and Chaos 



In this chapter we study a larger class of dynamical systems that include but go 
beyond Hamiltonian systems. We are interested, on the one hand, in dissipative 
systems, i.e. systems that lose energy through frictional forces or into which energy
is fed from exterior sources, and, on the other hand, in discrete, or discretized, systems
such as those generated by studying flows by means of the Poincaré mapping.
The occurrence of dissipation implies that the system is coupled to other, external
systems, in a controllable manner. The strength of such couplings appears in the 
set of solutions, usually in the form of parameters. If these parameters are varied 
it may happen that the flow undergoes an essential and qualitative change, at cer- 
tain critical values of the parameters. This leads rather naturally to the question of 
stability of the manifold of solutions against variations of the control parameters 
and of the nature of such a structural change. In studying these questions, one re- 
alizes that deterministic systems do not always have the well-ordered and simple 
behavior that we know from the integrable examples of Chap. 1, but that they may 
exhibit completely unordered, chaotic behavior as well. In fact, in contradiction 
with traditional views, and perhaps also with one’s own intuition, chaotic behavior 
is not restricted to dissipative systems (turbulence of viscous fluids, dynamics of 
climates, etc.). Even relatively simple Hamiltonian systems with a small number 
of degrees of freedom exhibit domains where the solutions have strongly chaotic 
character. As we shall see, some of these are relevant for celestial mechanics. 



6.1 Qualitative Dynamics 

In the preceding chapters, we dealt primarily with fundamental properties of me- 
chanical systems, with principles that allowed the construction of their equations of 
motion, and with general methods of solving these equations. The integrable cases, 
although a minority among the dynamical systems, were of special importance be- 
cause they allowed us to follow specific solutions analytically, to appreciate the 
significance and the power of conservation laws, and to study the restrictions that 
the latter impose on the manifold of motions in phase space. 

On the other hand, there are questions to which we have paid less attention so 
far; for example: What is the long-term behavior of a periodic motion that is subject 
to a small perturbation? What is the structure of the flow of a mechanical system
(i.e. the set of all possible solutions) in the large? Are there structural, characteristic
properties of the flow that do not depend on the specific values of the constants
appearing in the equations of motion? Can there be “ordered” and “unordered” 
types of motions, in a given system? If yes, can one define a quantitative measure 
for the lack of “order”? If a given system depends on external control parameters 
(strength of a perturbation, amplitude and frequency of a forced vibration, varying 
degree of friction, etc.), are there critical values of the parameters where the flow 
of the system changes its structure in the large? 

These questions show that, here, we approach the analysis of mechanical sys- 
tems in a somewhat different spirit. The equations of motion are assumed to be 
known (even though they may depend on control parameters that can be varied). 
We concentrate less on the individual solution but, instead, study the flow as a 
whole, its stability, its topological structure, and its behavior over long time peri- 
ods. It is this kind of analysis we wish to call qualitative dynamics. Quite logically, 
it leads one to investigate the stability of equilibrium positions and of periodic or- 
bits, to study attractors for dissipative systems (i.e. manifolds of lower dimension 
than the original phase space, to which the system tends, for large times, under the 
action of dissipation), to study bifurcations (i.e. structural changes of the flow at 
critical values of the control parameters), and to analyse the pattern of disordered 
motion if it occurs. 



6.2 Vector Fields as Dynamical Systems 

The dynamics of a very great variety of dynamical systems can be cast in the form 
of systems of first-order differential equations, viz. 



$$\frac{d}{dt}\,x(t) = F(x(t), t) .$$   (6.1)



Here, t is the time variable, x(t) is a point in the configuration space of the system, 
and F is a vector field that is continuous and often also differentiable. The space 
of the variables $x$ may be the velocity space, described locally by generalized coordinates
$q^i$ and velocities $\dot q^i$, or the phase space that we describe locally by the
$q^i$ and the canonically conjugate momenta $p_i$. There are, of course, other cases
where the x live in some other manifold: an example is provided by the Eulerian 
angles that parametrize the rotational motion of rigid bodies. 

As an example, let the equation of the motion be given in the form 



$$\ddot y + f_1(y, t)\,\dot y + f_2(y, t) = 0 .$$



It is easy to recast this in the form of (6.1), by taking

$$x_1(t) \stackrel{\text{def}}{=} y(t) , \quad x_2(t) \stackrel{\text{def}}{=} \dot y(t) ,$$

so that $\dot x_1 = x_2$, $\dot x_2 = -f_1 x_2 - f_2$. The pattern (6.1), of course, is not restricted to
Lagrangian or Hamiltonian systems. It also describes systems with dissipation, that
is systems where either mechanical energy is converted to other forms of energy
or where energy from external sources is fed into the system. Thus, (6.1) describes
a large class of dynamical systems, defined on the space of $x$ variables and the
time axis $\mathbb{R}_t$. This equation is a local expression of the underlying physical laws.
For instance, it relates the acceleration at every point of space and at each time 
t with the given field of forces. In this sense it determines the dynamics locally, 
“in the small”. The temporal evolution of the system, starting from an arbitrary 
but fixed initial configuration, will be known only when we have the complete 
solution of the differential equation (6.1) that obeys this initial condition. As an 
example, consider the Kepler problem (Sect. 1.7.2) for a given initial position $\mathbf{r}_0$
and initial velocity $\dot{\mathbf{r}}_0$ of the relative coordinate. Take $T_0 = \mu \dot{\mathbf{r}}_0^2/2$ to be smaller
than $|U(r_0)| = A/r_0$ (where $A = G m_1 m_2$) and take $\ell = \mu\,|\mathbf{r}_0 \times \dot{\mathbf{r}}_0|$ to be different
from zero. The specific solution that assumes this initial configuration is the
Keplerian ellipse with parameters $p = \ell^2/(A\mu)$ and

$$\varepsilon = \sqrt{1 + 2\,(T_0 + U(r_0))\,\ell^2/(\mu A^2)} .$$

This specific solution, though, gives little information on the general dynamics 
of mass points in the field of the gravitational force $F = -\nabla U$. Only when we
know the solutions for all allowed initial configurations do we learn that, besides 
ellipses and circles, the Kepler problem also admits hyperbolas and parabolas as 
the typical scattering orbits. In other words, the diversity of the dynamics hidden 
in an equation such as (6.1) will come to light only if one knows and understands 
all solutions, i.e. the complete flow of the vector field F. 
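
The recasting of a second-order equation into the first-order pattern (6.1) described above can be tried out numerically. The following is only a minimal sketch, not part of the original text: the helper names `first_order_system`, `rk4_step`, and `integrate` are hypothetical, the classical Runge-Kutta scheme is merely one convenient integrator, and the test case $f_1 = 0$, $f_2 = y$ (the harmonic oscillator) is an illustrative assumption.

```python
import math

def first_order_system(f1, f2):
    """Build F(x, t) for x = (x1, x2) = (y, dy/dt) from y'' + f1(y, t) y' + f2(y, t) = 0."""
    def F(x, t):
        x1, x2 = x
        # dx1/dt = x2,  dx2/dt = -f1 * x2 - f2
        return (x2, -f1(x1, t) * x2 - f2(x1, t))
    return F

def rk4_step(F, x, t, h):
    """One classical Runge-Kutta step for dx/dt = F(x, t)."""
    def shifted(k, c):
        return tuple(xi + c * ki for xi, ki in zip(x, k))
    k1 = F(x, t)
    k2 = F(shifted(k1, 0.5 * h), t + 0.5 * h)
    k3 = F(shifted(k2, 0.5 * h), t + 0.5 * h)
    k4 = F(shifted(k3, h), t + h)
    return tuple(xi + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))

def integrate(F, x0, t0, t1, steps):
    """Follow one integral curve of F from t0 to t1 with a fixed step size."""
    h = (t1 - t0) / steps
    x, t = tuple(x0), t0
    for _ in range(steps):
        x = rk4_step(F, x, t, h)
        t += h
    return x

# Illustrative choice (an assumption): f1 = 0, f2 = y, i.e. y'' + y = 0.
F = first_order_system(lambda y, t: 0.0, lambda y, t: y)
y_end, v_end = integrate(F, (1.0, 0.0), 0.0, math.pi, 2000)
print(y_end, v_end)  # should be close to cos(pi) and -sin(pi)
```

A single call to `integrate` yields one integral curve only; in the spirit of the remarks above, information about the flow as a whole requires repeating this for many initial conditions.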

These remarks apply to a system whose law of motion is given once and for 
all. In the case of real physical systems, this assumption is true only in exceptional 
situations, for the following reasons. 

(i) It may happen that the force law is not known exactly. Its explicit form
may contain one or several parameters that one wishes to determine from the
observed motions. Here is an example: if one doubts the long-range character of
the Coulomb potential between two point charges $e_1$ and $e_2$, one might assume
$U(r) = e_1 e_2/r^{\alpha}$ with $\alpha = 1 + \varepsilon$, with the idea of studying the dependence of the
corresponding dynamics on the parameter $\varepsilon$ (see also Practical Example 1.4).

(ii) The vector field F on the right-hand side of (6.1) describes the influence 
of an external system that might be varied. An example is an oscillator that is 
coupled to an external oscillation of variable frequency and variable amplitude. 

(iii) It may be that the differential equation (6.1) contains a predominant force 
field for which all physically allowed solutions are known. In addition, it contains 
further terms that describe the coupling of the system to other, external systems, the 
coupling being weak enough so that they may be taken to be small perturbations of 
the initial, soluble system. This is the situation that we studied in Sect. 2.38-2.40. 

In all cases and examples quoted above, the vector field F contains additional 
parameters that can be varied and that may have a decisive influence on the man- 
ifolds of solutions. For example, it may happen that the solutions of (6.1) change 
their structure completely once the parameters cross certain critical values. Stable
solutions can turn into unstable ones, a periodic solution, by the bifurcation
phenomenon, can double its frequency, etc.

From these remarks it is clear that the task of studying deterministic dynamic 
systems on the basis of their equation of motion (6.1) is a very ambitious one. 
When formulated this generally, the field of differentiable dynamics is by far not
a closed subject. On the contrary, there are only relatively few rigorous results and
a number of empirical results based on numerical studies. Therefore, studying this 
branch of mechanics leads one very quickly into the realm of modern research in 
this field. 

6.2.1 Some Definitions of Vector Fields and Their Integral Curves 

In this section we take up the tools introduced in Chap. 5 and discuss some con- 
cepts that are important for studying vector fields as dynamic systems. The local 
form of (6.1) is sufficient for an understanding of most of what follows in subse- 
quent sections. Therefore, the reader who is not used to the geometrical language 
may skip this section. On the other hand, if one wishes to learn more about the 
subjects touched upon in this chapter, some knowledge of the content of Chap. 5 is 
mandatory, as the specialized literature and the research in this field make extensive 
use of the concepts and methods of topology and differential geometry. 

In reality, (6.1) is a coordinate expression of the differential equation (5.42) for
integral curves of a smooth vector field $\mathcal{F}$ on the manifold $M$. In physics, typically
$M$ is the phase space $T^*Q$ or the velocity space $TQ$, i.e. the cotangent or tangent
bundles of the coordinate manifold, respectively.

The curve $\Phi_m : I \to M$ is an integral curve of $\mathcal{F}$ if the tangent vector field
$\dot\Phi_m$ coincides with $\mathcal{F}_{\Phi_m(t)}$, the restriction of $\mathcal{F}$ to points along the curve $\Phi_m$,

$$\dot\Phi_m(t) = \mathcal{F}_{\Phi_m(t)} , \quad t \in I \subset \mathbb{R}_t , \quad \mathcal{F} \in \mathfrak{X}(M) .$$   (6.2)

$I$ is an open interval on the time axis $\mathbb{R}_t$ that contains the origin $t = 0$. The integral
curve $\Phi_m$ is chosen such that it goes through $m$ at time zero. (We adopt the
notation of Sect. 1.19 because we shall use results from there.)

In the coordinates of the chart $(\varphi, U)$ we obtain the differential equation (5.43),
i.e.

$$\frac{d}{dt}\,(x^i \circ \Phi_m) = \mathcal{F}^i(x^k \circ \Phi_m, t) ,$$   (6.3)

or, in a somewhat simplified notation, (6.1).

Somewhat more generally, we have the following. For each $m_0$ of $M$ there is
an open neighborhood $V$ on $M$, an open interval $I$ on the time axis containing the
origin $t = 0$, and a smooth mapping

$$\Phi : V \times I \to M ,$$   (6.4)

such that, for every fixed $m \in V$, the curve $\Phi(m, t)$ is the integral curve $\Phi_m(t) =
\Phi(m, t)$ of $\mathcal{F}$ that goes through $m$ at time $t = 0$, $\Phi(m, t = 0) = m$. The theorem
of Sect. 1.19 guarantees the existence and uniqueness of this integral curve. $\Phi$ is
said to be the local flow of the vector field $\mathcal{F}$ and the integral curves $\Phi_m : I \to M$
are said to be the flux or flow lines of $\Phi$. Keeping the time variable in $\Phi(m, t)$
fixed and letting $m$ wander through $V$, we obtain the flow fronts

$$\Phi_t(m) = \Phi(m, t)$$   (6.5)

of the flow $\Phi$. This local manifold of solutions may be visualized as shown in
Fig. 6.1. Given a fixed time $t \in I$, each point of the domain $V$ flows along a certain
section of its integral curve $\Phi_m$. The domain as a whole moves on to $\Phi_t(V)$.




Fig. 6.1. During time $t$, the flow of a vector field transports
a domain $V$ to $V' = \Phi_t(V)$. The figure also shows
the orbits along which the points of $V$ move during this
time



If $I_{(m)}$ is the maximal allowed interval on the time axis for which $\Phi_m$ exists,
$\Phi_m$ is unique and is said to be the maximal integral curve through $m$. Applying this
reasoning to every point of $M$ yields a uniquely determined, open set $\Omega \subset M \times \mathbb{R}_t$,
on which the maximal flow $\Phi : \Omega \to M$ of the vector field $\mathcal{F}$ is defined. This
leads to the following.

Definition. A vector field $\mathcal{F}$ is said to be complete if $\Omega = M \times \mathbb{R}_t$, i.e. if its
maximal flow is defined on the whole manifold and for all times.

The Hamiltonian vector field of the harmonic oscillator provides an example
of a complete vector field,

$$X_H = \left(\frac{\partial H}{\partial p}, -\frac{\partial H}{\partial q}\right) = (p, -q) .$$   (6.6)

Its maximal flow

$$\Phi(m = (q, p), t) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} \begin{pmatrix} q \\ p \end{pmatrix}$$

is defined on the whole phase space (this is the example of Sect. 5.3.1 with $t_0 = 0$).
In practice there are examples of vector fields that are not complete. For instance,
in the Kepler problem the origin of the potential must be cut out of the orbit plane
because of its singularity at this point. The corresponding Hamiltonian vector field
then ceases to be complete on $\mathbb{R}^2$. Similarly, in relativistic mechanics and in general
relativity there are vector fields (such as velocity fields of geodesics) that are not
complete. For a complete vector field, $\Phi$ is a global flow,

$$\Phi : M \times \mathbb{R}_t \to M ,$$   (6.7)

that may be interpreted in yet another way. As in (6.5) let us keep the time fixed.
The flow (6.7) then generates a smooth mapping of $M$ onto itself,

$$\Phi_t : M \to M : m \mapsto \Phi_t(m) \stackrel{\text{def}}{=} \Phi(m, t) ,$$   (6.8)

which has the following properties. For $t = 0$ it is the identical mapping on $M$,
$\Phi_0 = \mathrm{id}_M$. Taking the composition of (6.7) with itself, twice or several times, one
obtains

$$\Phi_{t+s} = \Phi_t \circ \Phi_s \quad \text{for } t, s \in \mathbb{R}_t .$$

For each $t$, $\Phi_t$ is a diffeomorphism of $M$. The inverse of $\Phi_t$ is $\Phi_{-t}$. In this way
we obtain a one-parameter group of diffeomorphisms on $M$, generated by the flow
$\Phi$ and the assignment $t \mapsto \Phi_t$. Thus every complete vector field defines a one-parameter
group of diffeomorphisms. Conversely, a group of diffeomorphisms that
depends on a real parameter defines a complete vector field.
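
The group properties of a global flow can be checked directly on the harmonic oscillator example. The sketch below is an illustration only, not taken from the original text; the function name `flow`, the helper `close`, and the sample point and times are arbitrary assumptions. It evaluates the rotation formula for the flow and tests the identity, composition, and inverse properties.

```python
import math

def flow(m, t):
    """Flow of the harmonic oscillator vector field (p, -q), acting on m = (q, p)."""
    q, p = m
    return (math.cos(t) * q + math.sin(t) * p,
            -math.sin(t) * q + math.cos(t) * p)

def close(a, b, tol=1e-12):
    """Componentwise comparison of two phase-space points."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

m = (0.3, -1.2)   # arbitrary sample point in phase space
t, s = 0.7, 2.5   # arbitrary sample times

identity_ok = close(flow(m, 0.0), m)                     # Phi_0 = id
group_ok = close(flow(m, t + s), flow(flow(m, s), t))    # Phi_{t+s} = Phi_t o Phi_s
inverse_ok = close(flow(flow(m, t), -t), m)              # (Phi_t)^{-1} = Phi_{-t}
print(identity_ok, group_ok, inverse_ok)
```

For a vector field that is not complete, such a check can only be made for times inside the maximal interval of the integral curve considered.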

6.2.2 Equilibrium Positions and Linearization of Vector Fields 

Suppose the laws of motion of a physical system are described by the local equation 
(6.1) or, more generally, by an equation of the form of (6.2). A set of differential 
equations of this kind is called a dynamical system (although, strictly speaking, 
only the set of all solutions to these equations describes the dynamics). For what 
follows, we treat dynamical systems in the simplified form of (6.1), i.e. in the form
of differential equations on $\mathbb{R}^n$. For manifolds that are not Euclidean spaces this
means that we work in local charts. Exceptions regarding dynamical systems on 
more general, smooth manifolds will be mentioned explicitly. 

A point $x_0$ is said to be an equilibrium position of the vector field $F$ if
$F(x_0) = 0$. Equivalently, one also talks about a singular or critical point of the
vector field. For an autonomous system, for example, (6.1) becomes

$$\dot x(t) = F(x(t)) .$$   (6.9)

At a critical point $x_0$ the velocity vector vanishes so that the system cannot move
out of this point. However, as such, (6.9) says nothing about whether the configuration
$x_0$ is stable or unstable against perturbations. One learns more about this if
one linearizes (6.9) about the point $x_0$. For this purpose we introduce the following
definitions. 

(i) Linearization in the neighborhood of a critical point. In the terminology of
Sect. 6.2.1 the linearization of a vector field $\mathcal{F}$ at a critical point $m_0$ is defined to be
the linear mapping

$$\mathcal{F}'(m_0) : T_{m_0}M \to T_{m_0}M ,$$

which assigns to every tangent vector $v \in T_{m_0}M$ the derivative

$$\mathcal{F}'(m_0) \cdot v = \frac{d}{dt}\,\bigl(T\Phi_{m_0}(t) \cdot v\bigr)\Big|_{t=0} .$$

Here $\Phi$ is the flow of $\mathcal{F}$ and $T\Phi$ is the corresponding tangent mapping.

In the simplified form of (6.9), valid on $\mathbb{R}^n$, linearization means simply that
we expand about the point $x = x_0$. Thus, take $y = x - x_0$ and $F(x_0) = 0$. From
(6.9) we obtain the differential equation

$$\dot y^i(t) = \sum_{k=1}^{n} \left.\frac{\partial F^i}{\partial x^k}\right|_{x_0} y^k(t) ,$$   (6.10a)

or, in more compact notation,

$$\dot y(t) = DF|_{x_0} \cdot y(t) .$$   (6.10b)

This is a differential equation of the type studied in Sect. 1.21. The symbol D F 
denotes the matrix of partial derivatives, very much as in Sect. 2.29.1. For an au- 
tonomous system (6.9) this matrix is independent of time. The linear system (6.10) 
obtained from it is homogeneous and autonomous. 
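
The matrix of partial derivatives in (6.10) can also be approximated numerically. The following sketch is an illustration under stated assumptions, not part of the original text: the helper `jacobian` uses central finite differences, and the test system is a plane pendulum with linear friction written in the variables $(q, \dot q)$ with hypothetical parameter values `OMEGA` and `GAMMA`.

```python
import math

def jacobian(F, x0, h=1e-6):
    """Approximate DF at x0 by central finite differences (F autonomous)."""
    n = len(x0)
    J = [[0.0] * n for _ in range(n)]
    for k in range(n):
        xp, xm = list(x0), list(x0)
        xp[k] += h
        xm[k] -= h
        Fp, Fm = F(xp), F(xm)
        for i in range(n):
            J[i][k] = (Fp[i] - Fm[i]) / (2.0 * h)
    return J

OMEGA, GAMMA = 2.0, 0.1   # hypothetical frequency and friction constant

def pendulum(x):
    """Assumed test system: q'' + 2*GAMMA*q' + OMEGA^2 * sin(q) = 0, with x = (q, q')."""
    q, v = x
    return [v, -2.0 * GAMMA * v - OMEGA ** 2 * math.sin(q)]

# The origin is an equilibrium position: the vector field vanishes there.
A = jacobian(pendulum, [0.0, 0.0])
print(A)  # approximately [[0, 1], [-OMEGA**2, -2*GAMMA]]
```

Because the test system is autonomous, the resulting matrix does not depend on time, in line with the remark above.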

The following case is more general (see Exercise 1.22 and the example of 
Sect. 1.26). 

(ii) Linearization in the neighborhood of a given solution. Let $\Phi(t)$ be a solution
of (6.1) and let $y(t) = x(t) - \Phi(t)$. Then, from (6.1),

$$\dot y(t) = F(y(t) + \Phi(t), t) - \dot\Phi(t) = F(y(t) + \Phi(t), t) - F(\Phi(t), t) .$$

Expanding the right-hand side in a Taylor series about the solution $\Phi(t)$ yields the
linear and homogeneous differential equation

$$\dot y^i(t) = \sum_{k=1}^{n} \frac{\partial F^i}{\partial x^k}\bigl(x = \Phi(t), t\bigr)\, y^k(t) ,$$   (6.11)

where the partial derivatives of $F$ must be taken along the orbit $\Phi(t)$. Even if $F$
has no explicit time dependence, the linearized system (6.11) is not autonomous.
It becomes autonomous only if the specific solution is chosen to be an equilibrium
position, $\Phi(t) = x_0$, taking us back to the first case (6.10).

In the simpler case of linearizing an autonomous system about an equilibrium
position we obtain the linear, homogeneous, and autonomous system (6.10), i.e.

$$\dot y(t) = A\, y(t)$$   (6.12)

with the matrix $A$ being given by $A = DF|_{x_0}$.
The linear system (6.12) can be solved explicitly. The solution that fulfills the
initial condition $y(s) = y_0$ is

$$y(t) \equiv \Psi_{t,s}(y_0) = \exp[(t - s)A]\, y_0$$   (6.13)

with $\Psi_{s,s}(y_0) = y_0$ and with

$$\exp[(t - s)A] = \sum_{n=0}^{\infty} \frac{(t - s)^n}{n!}\, A^n .$$

If $A$ is given in diagonal form, this series becomes particularly simple. With $\alpha_i$
denoting the eigenvalues of $A$ the exponential series also has diagonal form, its
eigenvalues being $\exp(\Delta\,\alpha_i)$ with $\Delta = t - s$. For this reason, the eigenvalues of the
matrix $A = DF$ are called characteristic exponents of the vector field $F$ at the
point $x_0$.
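
The exponential series in (6.13) can be summed numerically for a small matrix. This is a minimal sketch rather than a robust implementation: the helpers `mat_mul` and `mat_exp` are hypothetical names, the series is simply truncated after a fixed number of terms, and the test matrix (an undamped oscillator in the variables $(q, \dot q)$ with an arbitrary $\omega$) is an assumption made for illustration.

```python
import math

def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, tau, terms=40):
    """Partial sum of exp(tau*A) = sum_n (tau^n / n!) A^n."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term tau^k A^k / k!
    for k in range(1, terms):
        term = [[tau / k * entry for entry in row] for row in mat_mul(term, A)]
        result = [[r + s for r, s in zip(rr, ss)] for rr, ss in zip(result, term)]
    return result

omega, tau = 1.5, 0.8                 # arbitrary illustrative values
A = [[0.0, 1.0], [-omega ** 2, 0.0]]  # assumed test matrix; A^2 = -omega^2 * identity
U = mat_exp(A, tau)

# Closed form for this particular A: cos(omega*tau)*1 + sin(omega*tau)/omega * A
c, s = math.cos(omega * tau), math.sin(omega * tau)
expected = [[c, s / omega], [-omega * s, c]]
max_error = max(abs(U[i][j] - expected[i][j]) for i in range(2) for j in range(2))
print(max_error)
```

For large values of the norm of $(t - s)A$ a plain truncated series loses accuracy, so this sketch is meant for small examples only.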

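For the sake of illustration we consider two examples. The first is the example
of Sect. 1.21.1, which is understood to be the linearization of the plane pendulum
at the point $x = 0$. From (1.46)

$$A = \begin{pmatrix} 0 & 1/m \\ -m\omega^2 & 0 \end{pmatrix} .$$

The eigenvalues of $A$ are easily found. From the characteristic equation $\det(\alpha \mathbb{1} -
A) = 0$ one finds $\alpha_1 = i\omega$, $\alpha_2 = -i\omega$, so that the diagonalized matrix is

$$\mathring{A} = \begin{pmatrix} i\omega & 0 \\ 0 & -i\omega \end{pmatrix} .$$   (6.14)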

In the second example we add a friction term to the plane pendulum, proportional
to the velocity of the motion. Thus, in linearized form we obtain the
differential equation

$$m\ddot q + 2\gamma m \dot q + m\omega^2 q = 0 ,$$   (6.15)

where $\gamma$ is a constant with the dimension of a frequency.

Using the notation of Sect. 1.18, $y^1 = q$, $y^2 = m\dot q$, (6.15) becomes

$$\dot y = A\, y$$

with

$$A = \begin{pmatrix} 0 & 1/m \\ -m\omega^2 & -2\gamma \end{pmatrix} .$$


The eigenvalues of $A$ are computed as in the previous example. For $\gamma^2 < \omega^2$
(this is the case of weak friction) one finds two complex conjugate characteristic
exponents

$$|\gamma| < \omega : \quad \mathring{A} = \begin{pmatrix} -\gamma + i\sqrt{\omega^2 - \gamma^2} & 0 \\ 0 & -\gamma - i\sqrt{\omega^2 - \gamma^2} \end{pmatrix} .$$   (6.16a)

For $\gamma^2 > \omega^2$ (this is the aperiodic limit) one finds two real characteristic exponents,
both of which have the sign of $-\gamma$,

$$|\gamma| > \omega : \quad \mathring{A} = \begin{pmatrix} -\gamma + \sqrt{\gamma^2 - \omega^2} & 0 \\ 0 & -\gamma - \sqrt{\gamma^2 - \omega^2} \end{pmatrix} .$$   (6.16b)



In all cases, $y_0 = (y^1, y^2) = 0$ is an equilibrium position. In the case of damped
motion, (6.16a) and (6.16b) show that all solutions (6.13) approach the origin exponentially
as $t$ goes to infinity. This point is certainly one of stable equilibrium.
If, on the other hand, $\gamma < 0$, the oscillations are enhanced and every initial configuration
except $y_0 = 0$ moves away from the origin, no matter how close to 0 it
is chosen. In this situation the origin is certainly a point of unstable equilibrium.
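
The characteristic exponents quoted in (6.16a) and (6.16b) can be reproduced from the matrix of the linearized oscillator. The sketch below is illustrative only: the function names are hypothetical, the eigenvalues of the $2 \times 2$ matrix are obtained from its characteristic equation via trace and determinant, and the numerical values of $m$, $\omega$, and $\gamma$ are arbitrary assumptions.

```python
import cmath

def characteristic_exponents(A):
    """Eigenvalues of a 2x2 matrix from alpha^2 - (Tr A) alpha + det A = 0."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + root) / 2.0, (trace - root) / 2.0)

def damped_oscillator_matrix(m, omega, gamma):
    """Matrix of (6.15) in the variables y1 = q, y2 = m * dq/dt."""
    return [[0.0, 1.0 / m], [-m * omega ** 2, -2.0 * gamma]]

m, omega = 2.0, 1.0   # arbitrary illustrative values

weak = characteristic_exponents(damped_oscillator_matrix(m, omega, 0.15))     # |gamma| < omega
strong = characteristic_exponents(damped_oscillator_matrix(m, omega, 1.5))    # |gamma| > omega
excited = characteristic_exponents(damped_oscillator_matrix(m, omega, -0.15)) # gamma < 0

for label, pair in (("weak", weak), ("strong", strong), ("excited", excited)):
    print(label, pair)
```

In this example the real parts are negative for positive $\gamma$ and positive for negative $\gamma$, which is the distinction between the stable and unstable cases discussed above.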

In the case of purely harmonic oscillations (6.14) the origin is again stable but 
in a weaker sense than with positive damping. Indeed, if we perturb the oscillator 
a little from its position of rest, it becomes a stationary state of motion with small 
amplitude. It neither returns to zero nor moves away from it for large times. The 
origin is stable but, obviously, its stability is of a different character than for the 
damped oscillator. Let us study these different kinds of stability in more detail. 



6.2.3 Stability of Equilibrium Positions 

Let $x_0$ be a critical point of the vector field $F$, i.e. $F(x_0, t) = 0$ and $x_0$
is an equilibrium position of the dynamical system (6.1) or (6.9). The notion of
stability of the critical point is qualified by the following definitions.

S1. The point $x_0$ is said to be stable (or Liapunov stable) if for every neighborhood
$U$ of $x_0$ there is a further neighborhood $V$ of $x_0$ such that the
integral curve that, at time $t = 0$, goes through an arbitrary point $x \in V$
exists in the limit $t \to +\infty$ and never leaves the domain $U$. Thus, when
expressed in symbols, we have for $x \in V$ and $\Phi_x(0) = x$, $\Phi_x(t) \in U$ for
all $t > 0$.

S2. The point $x_0$ is said to be asymptotically stable if there is a neighborhood
$U$ of $x_0$ that is such that the integral curve $\Phi_x(t)$ through an arbitrary
$x \in U$ is defined for $t \to +\infty$ and tends to $x_0$ as $t$ goes to infinity. Thus,
with $\Phi(x, t)$ denoting the flow,

$$\Phi(U, s) \subset \Phi(U, t) \subset U \quad \text{for } s > t > 0 \quad \text{and}$$
$$\lim_{t \to +\infty} \Phi_x(t) = x_0 \quad \text{for all } x \in U .$$

In the first case orbits that belong to initial configurations close to $x_0$ stay in
the neighborhood of that point, at all later times. In the second case they move toward
the critical point as time increases. Clearly, S2 contains the situation defined
in S1: a point that is asymptotically stable is also Liapunov stable.

The following proposition gives more precise information on how rapidly the
points of the neighborhood $U$ in S2 move towards $x_0$ as time increases.

Proposition I. Let $x_0$ be an equilibrium position of the dynamical system
(6.1), which is approximated by the linearization (6.10b) in a neighborhood
of $x_0$. Assume that for all eigenvalues $\alpha_i$ of $DF|_{x_0}$ we have $\mathrm{Re}\{\alpha_i\} < -c <
0$. Then there is a neighborhood $U$ of $x_0$ such that the flow of $F$ on $U$ (i.e.
which fulfills $\Phi(U, t = 0) = U$) is defined for all positive times, as well
as a constant $d$ such that for all $x \in U$ and all $t > 0$ we have

$$\|\Phi_x(t) - x_0\| \le d\, e^{-ct}\, \|x - x_0\| .$$   (6.17)

Here $\|\ldots\|$ denotes the distance function. The result (6.17) tells us that the
orbit through $x \in U$ converges to $x_0$ uniformly and at an exponential rate.

A criterion for instability of an equilibrium position is provided by the follow- 
ing. 



Proposition II. Let $x_0$ be an equilibrium position of the dynamical system
(6.1). If $x_0$ is stable then none of the characteristic exponents (i.e. the
eigenvalues of $DF|_{x_0}$ of the linearization of (6.1)) has a positive real part.



We skip the proofs of these propositions and refer, for example, to Hirsch and 
Smale (1974). Instead, we wish to illustrate them by a few examples and to give 
the normal forms of the linearization (6.10) for the case of two dimensions. 

One should note that definitions S1, S2 and propositions I and II apply to
arbitrary smooth vector fields, and not only to linear systems. In general, the linearization
(6.10) clarifies matters only in the immediate neighborhood of the critical
point $x_0$. The question of the actual size of the domain around $x_0$ from which
all integral curves converge to $x_0$, in the case of asymptotic stability, remains open,
except for linear systems. We shall return to this below.

For a system with one degree of freedom, $f = 1$, the space on which the
system (6.1) is defined has dimension 2. In the linearized form (6.10) it is

$$\begin{pmatrix} \dot y^1 \\ \dot y^2 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y^1 \\ y^2 \end{pmatrix}$$

with $a_{ik} = (\partial F^i/\partial x^k)|_{x_0}$. The eigenvalues are obtained from the characteristic
polynomial $\det(\alpha \mathbb{1} - A) = 0$, i.e. from the equation

$$\alpha^2 - \alpha(a_{11} + a_{22}) + a_{11}a_{22} - a_{12}a_{21} = 0 ,$$

which may be expressed by means of the trace $t = \mathrm{Tr}\,A$ and the determinant
$d = \det A$ as follows:

$$\alpha^2 - t\alpha + d = 0 .$$   (6.18)

As is well known, the roots of this equation fulfill the relations

$$\alpha_1 + \alpha_2 = t , \quad \alpha_1 \alpha_2 = d .$$

If the discriminant $D = t^2 - 4d$ is positive or zero, the solutions $\alpha_1$ and $\alpha_2$ are
real. In this case we must distinguish the following possibilities.

(i) $\alpha_1 < \alpha_2 < 0$, i.e. $d > 0$ and $t < -2\sqrt{d} < 0$. For diagonal $A$ the solutions
are $y^1 = \exp(\alpha_1 t)\, y^1_0$, $y^2 = \exp(\alpha_2 t)\, y^2_0$ and we obtain the pattern shown in
Fig. 6.2a. The origin is asymptotically stable: it is a node.

(ii) $\alpha_1 = \alpha_2 < 0$. This is a degenerate case contained in (i) and is shown in
Fig. 6.2b.

(iii) $\alpha_2 < 0 < \alpha_1$, i.e. $d < 0$. Here the origin is unstable. The orbits show the
typical pattern of a saddle point; see Fig. 6.2c: some orbits approach the origin,
others leave it.






Fig. 6.2a-c. Typical behavior of a system with one degree of freedom in the neighborhood of an
equilibrium position. In cases (a) and (b) the equilibrium is asymptotically stable. In case (c) it is 
unstable and has the structure of a saddle-point 



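If the discriminant $D$ is negative, the characteristic exponents are complex conjugate
numbers

$$\alpha_1 = \sigma + i\varrho , \quad \alpha_2 = \sigma - i\varrho ,$$

with $\sigma$ and $\varrho$ real. Here $t = 2\sigma$, $d = \sigma^2 + \varrho^2$. The various cases that are possible
here are illustrated in Fig. 6.3, which shows the examples of the damped, the excited,
and the unperturbed oscillator (6.16a). The figure shows the solution with
initial condition $y^1_0 = 1$, $y^2_0 = 0$ of (6.15), viz.

$$y^1(\tau) \equiv q(\tau) = \left[\cos\bigl(\tau\sqrt{1 - g^2}\bigr) + \frac{g}{\sqrt{1 - g^2}}\,\sin\bigl(\tau\sqrt{1 - g^2}\bigr)\right] e^{-g\tau} ,$$
$$y^2(\tau) \equiv \dot q(\tau) = -\frac{1}{\sqrt{1 - g^2}}\,\sin\bigl(\tau\sqrt{1 - g^2}\bigr)\, e^{-g\tau} .$$   (6.19)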



Here we have introduced $\tau = \omega t$ and $g = \gamma/\omega$. Curve A corresponds to $g = 0$,
curve B to $g = 0.15$, and curve C to $g = -0.15$. In the framework of our analysis
these examples tell us the following.

(iv) Curve A. Here $\sigma = 0$, $\varrho = \omega$, so that $t = 0$ and $d > 0$. The origin is
stable but not asymptotically stable. It is said to be a center.

(v) Curve B. Here $\sigma = -\gamma < 0$, $\varrho = \sqrt{\omega^2 - \gamma^2}$, so that $t < 0$, $d > 0$. The
origin is now asymptotically stable.

(vi) Curve C. Now $\sigma = -\gamma$ is positive while $\varrho$ is as in (v); therefore $t > 0$,
$d > 0$. The orbits move away from the origin like spirals. The origin is unstable
for $t \to +\infty$.







Our discussion shows the typical cases that occur. Figure 6.4 illustrates the var- 
ious domains of stability in the plane of the parameters (t, d). The discussion is 
easily completed by making use of the real normal forms of the matrix A and by 
considering all possible cases, including the question of stability or instability as
$t$ tends to $-\infty$.
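
The case distinction in terms of trace and determinant lends itself to a small lookup routine. The sketch below is an illustration, not part of the original text: the function name `classify` and the returned labels are assumptions, the word "spiral" is used for the complex-conjugate case with nonzero real part, and the borderline cases are tested with exact comparisons, which is adequate only for exactly representable inputs.

```python
def classify(t, d):
    """Classify the origin of a linear 2D system by trace t and determinant d of A."""
    if d < 0:
        # real exponents of opposite sign
        return "saddle (unstable)"
    if d == 0:
        # at least one vanishing exponent; not covered by the cases in the text
        return "degenerate"
    if t == 0:
        # purely imaginary exponents
        return "center (stable)"
    discriminant = t * t - 4.0 * d
    kind = "node" if discriminant >= 0 else "spiral"
    stability = "asymptotically stable" if t < 0 else "unstable"
    return f"{kind} ({stability})"

# Oscillator (6.15) with omega = 1: t = -2*gamma, d = omega^2 = 1.
for gamma in (0.0, 0.15, -0.15, 1.5):
    print(gamma, classify(-2.0 * gamma, 1.0))
```

The three values $\gamma = 0$, $0.15$, $-0.15$ correspond to curves A, B, and C above when $\omega = 1$; the value $1.5$ is an additional, strongly damped example.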




Fig. 6.4. For a system with $f = 1$
the various stability regions are determined
by the trace $t$ and the determinant
$d$ of the linearization $A$. AS
means asymptotically stable (the equilibrium
position is a node). S means
stable (center) and US means unstable
(saddle-point)



6.2.4 Critical Points of Hamiltonian Vector Fields

It is instructive to try the stability criteria developed above on canonical systems. 
These are governed by the canonical equations (2.99), 

$$\dot x = J H_{,x} ,$$   (6.20)

where $J$ is defined as in (2.102), its properties being

$$\det J = 1 , \quad J^T = J^{-1} = -J , \quad J^2 = -\mathbb{1} .$$

The system (6.20) having an equilibrium position at $x_0$, the Hamiltonian vector
field $X_H$ vanishes at that point. As $J$ is regular, also the vector of partial derivatives
$H_{,x}$ vanishes at $x_0$. Linearizing around $x_0$, i.e. setting $y = x - x_0$ and expanding
the right-hand side of (6.20), we obtain the linear system

$$\dot y = A\, y$$

with $A = JB$ and $B = \{\partial^2 H/\partial x^k \partial x^l\,|_{x = x_0}\}$.

The matrix $B$ is symmetric, $B = B^T$. Making use of the properties of $J$ we
have

$$A^T J + J A = 0 .$$   (6.21)

A matrix that obeys condition (6.21) is said to be infinitesimally symplectic. This
name becomes clear if one considers a symplectic matrix $M$ that differs only a
little from $\mathbb{1}$,

$$M = \mathbb{1} + \varepsilon A + O(\varepsilon^2) .$$

The defining relation (2.113) then indeed yields (6.21), to first order in $\varepsilon$.

The following result applies to matrices which fulfill condition (6.21). 



Proposition. If $\alpha$ is an eigenvalue of the infinitesimally symplectic matrix
$A$, having multiplicity $k$, then also $-\alpha$ is an eigenvalue of $A$ and has the
same multiplicity. If $\alpha = 0$ is an eigenvalue then its multiplicity is even.



Proof. The proof makes use of the properties of $J$, of the symmetry of $B$ and of
well-known properties of determinants. The eigenvalues are the zeros of the characteristic
polynomial $P(\alpha) = \det(\alpha \mathbb{1} - A)$. Therefore, it is sufficient to show that
$\det(\alpha \mathbb{1} - A) = \det(\alpha \mathbb{1} + A)$. This is seen as follows:

$$P(\alpha) = \det(\alpha \mathbb{1} - A) = \det(-\alpha J^2 - JB) = \det J \,\det(-\alpha J - B)$$
$$= \det(-\alpha J - B)^T = \det(\alpha J - B) = \det(\alpha \mathbb{1} - J^{-1}B)$$
$$= \det(\alpha \mathbb{1} + JB) = \det(\alpha \mathbb{1} + A) .$$

Thus, the proposition is proved. □

1 In Sect. 5.5.4 symplectic transformations are defined without reference to coordinates, see definition
SYT. If these are chosen to be infinitesimal, $F = \mathrm{id} + \varepsilon A$, then to first order in $\varepsilon$ relation
(6.21) is obtained in the coordinate-free form $\omega(Ae, e') + \omega(e, Ae') = 0$.
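
The statement of the proposition can be checked on a concrete matrix. The following sketch is an illustration under stated assumptions, not part of the original text: it takes $f = 2$, the matrix $J$ in $2 \times 2$ block form, an arbitrarily chosen symmetric integer matrix $B$, and hypothetical helper names; exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result            # a row exchange flips the sign
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return result

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

# J in block form (0, 1; -1, 0) for f = 2, and an arbitrary symmetric B (assumption).
J = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
B = [[2, 1, 0, 3], [1, 4, -1, 0], [0, -1, 5, 2], [3, 0, 2, 1]]
A = mat_mul(J, B)

def char_poly(A, alpha, sign):
    """det(alpha*1 - sign*A); sign = +1 gives P(alpha), sign = -1 gives det(alpha*1 + A)."""
    n = len(A)
    return det([[(alpha if i == j else 0) - sign * A[i][j] for j in range(n)]
                for i in range(n)])

AtJ, JA = mat_mul(transpose(A), J), mat_mul(J, A)
condition_621 = all(AtJ[i][j] + JA[i][j] == 0 for i in range(4) for j in range(4))
pairing = all(char_poly(A, a, +1) == char_poly(A, a, -1)
              for a in (Fraction(1, 3), 2, -5, 7))
print(condition_621, pairing)
```

Agreement of the two determinants at a handful of sample values of $\alpha$ is of course only a spot check of the identity proved above, not a substitute for the proof.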

This result shows that the assumptions of proposition I (Sect. 6.2.3) can never 
be fulfilled for canonical systems. As a consequence, canonical systems cannot 
have asymptotically stable equilibria. Proposition II of Sect. 6.2.3, in turn, can be 
applied to canonical systems: it tells us that the equilibrium can only be stable 
if all characteristic exponents are purely imaginary. As an example consider the 
case of small oscillations about an absolute minimum $q_0$ of the potential energy
described in Practical Example 2.1. Expanding the potential energy about $q_0$ up
to second order in $(q - q_0)$, we obtain equations of motion that are linear. After
we have transformed to normal coordinates the Hamiltonian function that follows
from the Lagrangian function (A.8), Practical Example 2.1, reads

$$H = \frac{1}{2}\sum_{i=1}^{f} P_i^2 + \frac{1}{2}\sum_{i=1}^{f} \Omega_i^2 Q_i^2 .$$

Setting $Q'_i = \sqrt{\Omega_i}\, Q_i$ and $P'_i = P_i/\sqrt{\Omega_i}$, $H$ takes the form

$$H = \frac{1}{2}\sum_{i=1}^{f} \Omega_i \bigl(P_i'^2 + Q_i'^2\bigr) .$$   (6.22)

We calculate the matrix $A = JB$ from this: $A$ takes the standard form

$$A = \begin{pmatrix} 0 & \mathrm{diag}(\Omega_1, \ldots, \Omega_f) \\ -\mathrm{diag}(\Omega_1, \ldots, \Omega_f) & 0 \end{pmatrix} .$$   (6.23)

Generally, one can show that the linearization of a Hamiltonian system has precisely
this standard form if the Hamiltonian function of its linearization is positive-definite.
(Clearly, in the case of small oscillations about an absolute minimum of
the potential energy, $H$ does indeed have this property.) Diagonalizing the matrix
(6.23) shows that the characteristic exponents take the purely imaginary values

$$\pm i\Omega_1 , \ \pm i\Omega_2 , \ \ldots , \ \pm i\Omega_f .$$

Therefore, a system for which $A$ has the standard form (6.23) does not contradict
the criterion for stability of proposition II of Sect. 6.2.3. The point $x_0$ has a
chance to be stable although this is not decided by the above propositions. In order
to proceed one tries to find an auxiliary function $V(x)$ that has the property that
it vanishes in $x_0$ and is positive everywhere in a certain open neighborhood $U$ of
that point. In the example (6.22) this could be the energy function (with $x^i \equiv Q'_i$,
$x^{i+f} \equiv P'_i$, $i = 1, \ldots, f$)

$$V(x) \stackrel{\text{def}}{=} E(x) = \frac{1}{2}\sum_{i=1}^{f} \Omega_i \bigl[(x^{i+f})^2 + (x^i)^2\bigr] .$$

We then take the time derivative of $V(x)$ along orbits of the system. If this derivative
is negative or zero everywhere in $U$, this means that no solution moves outward,
away from $x_0$. Then the point $x_0$ is stable.

Remark. An auxiliary function that has these properties is called a Liapunov func- 
tion. 
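
The standard form (6.23) and the role of the energy as an auxiliary function can be checked numerically. The following is a minimal sketch, assuming the ordering $x = (Q', P')$ and three arbitrarily chosen frequencies; the helper name `standard_form` is introduced here for illustration only.

```python
import numpy as np

def standard_form(omegas):
    """Return A = J B and B for H = 1/2 * sum_i Omega_i (P_i'^2 + Q_i'^2), with x = (Q', P')."""
    f = len(omegas)
    Omega = np.diag(omegas)
    zero = np.zeros((f, f))
    J = np.block([[zero, np.eye(f)], [-np.eye(f), zero]])   # symplectic unit matrix
    B = np.block([[Omega, zero], [zero, Omega]])            # matrix of second derivatives of H
    return J @ B, B

omegas = np.array([1.0, 2.5, 4.0])      # illustrative frequencies (assumed values)
A, B = standard_form(omegas)
exponents = np.linalg.eigvals(A)        # characteristic exponents of x' = A x

# Time derivative of V(x) = 1/2 x^T B x along the linear flow x' = A x,
# evaluated at an arbitrary point x.
rng = np.random.default_rng(0)
x = rng.normal(size=2 * len(omegas))
V_dot = x @ B @ (A @ x)
```

In this sketch the exponents come out purely imaginary, in pairs $\pm i\Omega_i$, and `V_dot` vanishes up to rounding, so the energy is neither increased nor decreased by the linearized flow.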

The test for stability by means of a Liapunov function can also be applied
to systems that are not canonical, and it may even be sharpened there. Indeed, if
the derivative of $V(x)$ along solutions is negative everywhere in $U$, then all orbits
move inward, towards $x_0$. Therefore this point is asymptotically stable. Let us
illustrate this by the example of the oscillator (6.15) with and without damping.
The point $(q = 0, \dot q = 0)$ is an equilibrium position. A suitable Liapunov function
is provided by the energy function,

$$V(q, \dot q) := \frac{1}{2}\left(\dot q^2 + \omega^2 q^2\right) \quad\text{with}\quad V(0, 0) = 0. \tag{6.24}$$

Calculate $\dot V$ along solution curves:

$$\dot V = \frac{\partial V}{\partial q}\dot q + \frac{\partial V}{\partial \dot q}\ddot q = \omega^2 q\dot q - 2\gamma\dot q^2 - \omega^2 q\dot q = -2\gamma\dot q^2. \tag{6.25}$$

In the second step we have replaced $\ddot q$ by $q$ and $\dot q$, using (6.15). For $\gamma = 0$, $\dot V$
vanishes identically. No solution moves outward or inward and therefore $(0, 0)$ is
stable: it is a center. For positive $\gamma$, $\dot V$ is negative along all orbits, vanishing only
at the isolated instants where $\dot q = 0$. The solutions move inward and therefore
$(0, 0)$ is asymptotically stable.
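
The behavior of the Liapunov function (6.24) along orbits can be reproduced with a few lines of code. This is a sketch under stated assumptions: the oscillator is taken in the form $\ddot q = -2\gamma\dot q - \omega^2 q$, a hand-written Runge-Kutta step replaces a library integrator, and the parameter values are arbitrary.

```python
import numpy as np

def rhs(state, gamma, omega):
    """Damped oscillator written as a first-order system in (q, dq/dt)."""
    q, v = state
    return np.array([v, -2.0 * gamma * v - omega**2 * q])

def rk4_step(state, dt, gamma, omega):
    k1 = rhs(state, gamma, omega)
    k2 = rhs(state + 0.5 * dt * k1, gamma, omega)
    k3 = rhs(state + 0.5 * dt * k2, gamma, omega)
    k4 = rhs(state + dt * k3, gamma, omega)
    return state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def V(state, omega):
    """Energy function (6.24)."""
    q, v = state
    return 0.5 * (v**2 + omega**2 * q**2)

def liapunov_values(gamma, omega=1.0, dt=0.01, steps=2000, start=(1.0, 0.0)):
    """Values of V along one numerically integrated orbit."""
    state = np.array(start, dtype=float)
    values = [V(state, omega)]
    for _ in range(steps):
        state = rk4_step(state, dt, gamma, omega)
        values.append(V(state, omega))
    return np.array(values)

V_undamped = liapunov_values(gamma=0.0)   # stays constant: center
V_damped = liapunov_values(gamma=0.1)     # never increases: asymptotic stability
```

Plotting `V_damped` against time would show a staircase-like decay: the flat pieces are the instants where $\dot q$ passes through zero.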



6.2.5 Stability and Instability of the Free Top 



A particularly beautiful example of a nonlinear system with stable and unstable
equilibria is provided by the motion of a free, asymmetric rigid body. Following
the convention of Sect. 3.13 (3.60), the principal axes are labeled such that
$0 < I_1 < I_2 < I_3$. We set

$$x^i := \omega_i \quad\text{and}\quad F^1 := \frac{I_2 - I_3}{I_1}\,x^2 x^3 \quad\text{(with cyclic permutations)}.$$

The Eulerian equations (3.59) take the form (6.1), viz.

$$\dot x^1 = -\frac{I_3 - I_2}{I_1}\,x^2 x^3, \qquad \dot x^2 = +\frac{I_3 - I_1}{I_2}\,x^3 x^1, \qquad \dot x^3 = -\frac{I_2 - I_1}{I_3}\,x^1 x^2. \tag{6.26}$$



Here, we have written the right-hand sides such that all differences $I_i - I_k$ are
positive. This dynamical system has three critical points (equilibria) whose stability
we wish to investigate, viz.

$$x_0^{(1)} = (\omega, 0, 0), \qquad x_0^{(2)} = (0, \omega, 0), \qquad x_0^{(3)} = (0, 0, \omega),$$

$\omega$ being an arbitrary positive constant. We set $y = x - x_0^{(i)}$ and linearize equations
(6.26). For example, in the neighborhood of the point $x_0^{(1)}$ we obtain the linear
system

$$\dot y = \omega\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \dfrac{I_3 - I_1}{I_2} \\ 0 & -\dfrac{I_2 - I_1}{I_3} & 0 \end{pmatrix} y \equiv A y.$$



The characteristic exponents follow from the equation $\det(\alpha\,\mathbb{1} - A) = 0$ and are
found to be

$$\alpha_1^{(1)} = 0, \qquad \alpha_2^{(1)} = -\alpha_3^{(1)} = i\omega\sqrt{(I_2 - I_1)(I_3 - I_1)/I_2 I_3}. \tag{6.27a}$$

A similar analysis yields the following characteristic exponents at the points $x_0^{(2)}$
and $x_0^{(3)}$, respectively:

$$\alpha_1^{(2)} = 0, \qquad \alpha_2^{(2)} = -\alpha_3^{(2)} = \omega\sqrt{(I_3 - I_2)(I_2 - I_1)/I_1 I_3}, \tag{6.27b}$$

$$\alpha_1^{(3)} = 0, \qquad \alpha_2^{(3)} = -\alpha_3^{(3)} = i\omega\sqrt{(I_3 - I_2)(I_3 - I_1)/I_1 I_2}. \tag{6.27c}$$

Note that in the case of (6.27b) one of the characteristic exponents has a positive
real part. Proposition II of Sect. 6.2.3 tells us that $x_0^{(2)}$ cannot be a stable
equilibrium. This confirms our conjecture of Sect. 3.14 (ii), which we obtained from
Fig. 3.22: rotations about the axis corresponding to the intermediate moment of
inertia cannot be stable.

Regarding the other two equilibria, the characteristic exponents (6.27a) and
(6.27c) are either zero or purely imaginary. Therefore, $x_0^{(1)}$ and $x_0^{(3)}$ have a chance
of being stable. This assertion is confirmed by means of the following Liapunov
functions for the points $x_0^{(1)}$ and $x_0^{(3)}$, respectively:

$$V^{(1)}(x) := \frac{1}{2}\left[I_2(I_2 - I_1)(x^2)^2 + I_3(I_3 - I_1)(x^3)^2\right],$$

$$V^{(3)}(x) := \frac{1}{2}\left[I_1(I_3 - I_1)(x^1)^2 + I_2(I_3 - I_2)(x^2)^2\right].$$

$V^{(1)}$ vanishes at $x = x_0^{(1)}$; it is positive everywhere in the neighborhood of this
point. Taking the time derivative of $V^{(1)}$ along solutions, we obtain, making use
of the equations of motion (6.26),

$$\dot V^{(1)}(x) = I_2(I_2 - I_1)\,x^2\dot x^2 + I_3(I_3 - I_1)\,x^3\dot x^3 = \left[(I_2 - I_1)(I_3 - I_1) - (I_3 - I_1)(I_2 - I_1)\right]x^1 x^2 x^3 = 0.$$

An analogous result is obtained for $V^{(3)}(x)$. As a consequence the equilibria $x_0^{(1)}$
and $x_0^{(3)}$ are stable. However, they are not Liapunov stable, in the sense of the
definition (St3) below.
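
The three sets of characteristic exponents and the vanishing of $\dot V^{(1)}$ can be verified numerically. The sketch below assumes illustrative values $I_1 = 1$, $I_2 = 2$, $I_3 = 3$ and $\omega = 1.5$; the function names are chosen here for illustration and do not come from the text.

```python
import numpy as np

I1, I2, I3 = 1.0, 2.0, 3.0      # assumed principal moments, 0 < I1 < I2 < I3
w = 1.5                          # assumed angular velocity of the stationary rotation

def F(x):
    """Right-hand side of the Eulerian equations in the form (6.26)."""
    x1, x2, x3 = x
    return np.array([
        -(I3 - I2) / I1 * x2 * x3,
        +(I3 - I1) / I2 * x3 * x1,
        -(I2 - I1) / I3 * x1 * x2,
    ])

def jacobian(x):
    """Matrix of partial derivatives of F, evaluated at x."""
    x1, x2, x3 = x
    a, b, c = (I3 - I2) / I1, (I3 - I1) / I2, (I2 - I1) / I3
    return np.array([
        [0.0, -a * x3, -a * x2],
        [b * x3, 0.0, b * x1],
        [-c * x2, -c * x1, 0.0],
    ])

equilibria = {1: (w, 0.0, 0.0), 2: (0.0, w, 0.0), 3: (0.0, 0.0, w)}
exponents = {k: np.linalg.eigvals(jacobian(np.array(p))) for k, p in equilibria.items()}

def V1_dot(x):
    """Derivative of V^(1) along the flow: gradient of V^(1) contracted with F."""
    grad = np.array([0.0, I2 * (I2 - I1) * x[1], I3 * (I3 - I1) * x[2]])
    return grad @ F(x)
```

With these numbers only the exponents at the second equilibrium (intermediate axis) acquire a nonzero real part, while `V1_dot` returns zero up to rounding at any point `x`.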



6.3 Long-Term Behavior of Dynamical Flows and Dependence on External Parameters

In this section we investigate primarily dissipative systems. The example of the 
damped oscillator (6.15), illustrated by Fig. 6.3, may suggest that the dynamics
of dissipative systems is simple and not very interesting. This impression is mis- 
leading. The behavior of dissipative systems can be more complex by far than the 
simple “decay” of the motion whereby all orbits approach exponentially an asymp- 
totically stable point. This is the case, for instance, if the system also contains a 
mechanism that, on average, compensates for the energy loss and thus keeps the 
system in motion. Besides points of stability there can be other structures of higher 
dimension that certain subsets of orbits will cling to asymptotically. In approaching
these attractors for $t \to +\infty$, the orbits will lose practically all memory of
their initial condition, even though the dynamics is strictly deterministic. On the 
other hand, there are systems where orbits on an attractor with neighboring initial 
conditions, for increasing time, move apart exponentially. This happens in dynam- 
ical systems that possess what are called strange attractors. They exhibit the phe- 
nomenon of extreme sensitivity to initial conditions, which is one of the agents 
for deterministic chaos: two orbits whose distance, on average, increases exponen- 
tially, pertain to initial conditions that are indistinguishable from any practical point 
of view. 

For this phenomenon to happen, there must be at least three dynamical vari- 
ables. In point mechanics this means that the phase space must have dimensions 
4, 6, or higher. Obviously, there is a problem in representing the flow of a dynam- 
ical system as a whole because of the large number of dimensions. On the other 
hand, if we deal with finite motions that stay in the neighborhood of a periodic 
orbit, it may be sufficient to study the intersection of the orbits with hypersurfaces 
of smaller dimension, perpendicular to the periodic orbit. This is the concept of 
Poincaré mapping. It leads to a discretization of the flow: e.g. one records the flow
only at discrete times $t_0$, $t_0 + T$, $t_0 + 2T$, etc., where $T$ is the period of the reference
orbit, or else, when it hits a given transversal hypersurface. The mapping of an
$m$-dimensional flow on a hypersurface of dimension $(m - 1)$, in general, may give
a good impression of its topology. There may even be cases where it is sufficient
to study a single variable at special, discrete points (e.g., maxima of a function).
One then obtains a kind of return mapping in one dimension that one may think 
of as a stroboscopic observation of a one-dimensional system. If this mapping is 
suitably chosen it may give hints to the behavior of the flow as a whole. 

In general, dynamical systems depend on one or several parameters that control 
the strength of external influences. An example is provided by forced oscillations, 
the frequency and amplitude of the exciting oscillation being the control parameters.
In varying these parameters one may hit critical values at which the structure
of the flow changes qualitatively. Critical values of this kind are called bifurcations. 
Bifurcations, too, play an important role in the development of deterministic chaos. 

This section is devoted to a more precise definition of the concepts sketched 
above. They are then discussed and illustrated by means of a number of examples. 

6.3.1 Flows in Phase Space 

Consider a connected domain $U_0$ of initial conditions in phase space that has the
oriented volume $V_0$.

(i) For Hamiltonian systems Liouville’s theorem tells us that the flow $\Phi$ carries
this initial set across phase space as if it were a connected part of an incompressible
fluid. Total volume and orientation are preserved; at any time $t$ the image $U_t$ of
$U_0$ under the flow has the same volume $V_t = V_0$. Note, however, that this may be
effected in rather different ways: for a system with $f = 2$ (i.e. four-dimensional
phase space) let $U_0$ be a four-dimensional ball of initial configurations. The flow
of the Hamiltonian vector field may be such that this ball remains unchanged or is
deformed only slightly, as it wanders through phase space. At the other extreme,
it may be such that the flow drives apart, at an exponential rate $\exp(\alpha t)$, points of
one direction in $U_0$, while contracting points in a direction perpendicular to the
first, at a rate $\exp(-\alpha t)$ so that the total volume is preserved.$^2$ Liouville’s theorem
is respected in either situation. In the former case orbits through $U_0$ possess a certain
stability. In the latter case they are unstable in the sense that there are orbits
with arbitrarily close initial conditions that nevertheless move apart at an exponential
rate. Even though the system is deterministic, it is practically impossible
to reconstruct the precise initial condition from an observation at a time $t > 0$.

(ii) For dissipative systems the volume $V_0$ of the initial set $U_0$ is not conserved.
If the system loses energy, the volume will decrease monotonically. This may happen
in such a way that the initial domain shrinks more or less uniformly along all
independent directions in $U_0$. There is also the possibility, however, that one direction
spreads apart while others shrink at an increased rate such that the volume
as a whole decreases.

$^2$ In systems with $f = 1$, i.e. with a two-dimensional phase space, and keeping clear of saddle-point
equilibria, the deformation can be no more than linear in time, cf. Exercise 6.3.

A measure of constant increase or decrease of volume in phase space is provided
by the Jacobian determinant of the matrix of partial derivatives $D\Phi$ (2.119).
If this determinant is 1, then Liouville’s theorem applies. If it decreases as a function
of time, the phase space volume shrinks. Whenever the Jacobian determinant
is different from zero, the flow is invertible. If it has a zero at a point $x$ of phase
space, the flow is irreversible at this point.
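
For a linear system $\dot x = Ax$ the matrix $D\Phi_t$ obeys $\dot M = AM$ with $M(0) = \mathbb{1}$, so its determinant can be followed numerically. The sketch below does this for the oscillator written in the variables $(q, \dot q)$, once without and once with damping; the values of $\gamma$, $\omega$ and $t$ are assumptions made for illustration. For the damped case one expects $\det D\Phi_t = \exp(t\,\mathrm{tr}\,A) = \exp(-2\gamma t)$.

```python
import numpy as np

def flow_jacobian(A, t, steps=2000):
    """Integrate dM/dt = A M, M(0) = 1, with RK4; M(t) is the matrix D Phi_t of the linear flow."""
    M = np.eye(A.shape[0])
    dt = t / steps
    for _ in range(steps):
        k1 = A @ M
        k2 = A @ (M + 0.5 * dt * k1)
        k3 = A @ (M + 0.5 * dt * k2)
        k4 = A @ (M + dt * k3)
        M = M + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return M

omega, t = 1.0, 5.0              # assumed frequency and observation time

def oscillator_matrix(gamma):
    # x = (q, dq/dt) with d2q/dt2 = -2*gamma*dq/dt - omega^2*q
    return np.array([[0.0, 1.0], [-omega**2, -2.0 * gamma]])

det_conservative = np.linalg.det(flow_jacobian(oscillator_matrix(0.0), t))   # area preserved
det_damped = np.linalg.det(flow_jacobian(oscillator_matrix(0.2), t))         # area shrinks
```

The determinant stays at 1 for $\gamma = 0$ and drops to about $e^{-2}$ for $\gamma = 0.2$ at $t = 5$, while remaining different from zero, in line with the invertibility of the flow.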

A simple phenomenological method of introducing dissipative terms into
Hamiltonian systems consists in changing the differential equation for $p(t)$ in the
following manner:

$$\dot p_j = -\frac{\partial H}{\partial q^j} - R_j(q, p). \tag{6.28}$$

One calculates the time derivative of $H$ along solutions of the equations of motion,

$$\frac{dH}{dt} = \sum_i \frac{\partial H}{\partial q^i}\dot q^i + \sum_j \frac{\partial H}{\partial p_j}\dot p_j = -\sum_{i=1}^{f}\dot q^i R_i(q, p). \tag{6.29}$$

Depending on the nature of the dissipative terms $R_i$, the energy decreases either
until the system has come to rest or the flow has reached a submanifold of lower
dimension than $\dim P$ on which the dissipative term $\sum_i \dot q^i R_i(q, p)$ vanishes.

In the example (6.15) of the damped oscillator we have

$$H = (p^2/m + m\omega^2 q^2)/2 \quad\text{and}\quad R = 2\gamma m\dot q,$$

so that

$$\frac{dH}{dt} = -2\gamma m\dot q^2 = -\frac{2\gamma}{m}p^2. \tag{6.30}$$

In this example the leakage of energy ceases only when the system has come to
rest, i.e. when it has reached the asymptotically stable critical point $(0, 0)$.



6.3.2 More General Criteria for Stability 

In the case of dynamical systems whose flow shows the behavior described above, 
the stability criteria of Sect. 6.2.3 must be generalized somewhat. Indeed, an orbit 
that tends to a periodic orbit, for $t \to +\infty$, can do so in different ways. Furthermore,
as this concerns a local property of flows, one might ask whether there are
subsets of phase space that are preserved by the flow, without “dissolving” for large 
times. The following definitions collect the concepts relevant to this discussion. 

Let $F$ be a complete vector field on $\mathbb{R}^n$, or on phase space, or, more generally,
on the manifold $M$, depending on the system one is considering. Let $B$ be a subset
of $M$ whose points are possible initial conditions for the flow of the differential
equation (6.1),

$$\Phi_{t=0}(B) = B.$$

For positive or negative times $t$ this subset moves to $\Phi_t(B)$. The image $\Phi_t(B)$ can
be contained in $B$, but it may also have drifted out of $B$, partially or completely.
We sharpen the first possibility as follows.

(i) If the image of $B$ under the flow is contained in $B$, for all $t > 0$,

$$\Phi_t(B) \subset B, \tag{6.31}$$

the set $B$ is said to be positively invariant.

(ii) Similarly, if the condition (6.31) held in the past, i.e. for all $t < 0$, $B$ is
said to be negatively invariant.

(iii) Finally, $B$ is said to be invariant if its image under the flow is contained
in $B$ for all $t$,

$$\Phi_t(B) \subset B \quad\text{for all } t. \tag{6.32}$$

(iv) If the flow has several neighboring domains for which (6.32) holds, then
obviously their union has the same property. For this reason one says that $B$ is a
minimal set if it is closed, nonempty, and invariant in the sense of (6.32), and if
it cannot be decomposed into subsets that have the same properties.

A periodic orbit of a flow $\Phi_t$ has the property $\Phi_{t+T}(m) = \Phi_t(m)$, for all points
m on the orbit, T being the period. Very much like equilibrium positions, closed, 
periodic orbits are generally exceptional in the diversity of integral curves of a 
given dynamical system. Furthermore, equilibrium positions may be understood 
as special, degenerate examples of periodic orbits. For this reason equilibria and 
periodic orbits are called critical elements of the vector field F that defines the dy- 
namical system (6.1). It is not difficult to verify that critical elements are minimal 
sets in the sense of definitions (iii) and (iv) above. 

Orbits that move close to each other, for increasing time, or tend towards each
other, can do so in different ways. This kind of “moving stability” leads us to the
following definitions. We consider a reference orbit $A$, say, the orbit of a mass
point $m_A$. This may, but need not, be a critical element. Let another mass point
$m_B$ move along a neighboring orbit $B$. At time $t = 0$, $m_A$ starts from $m_A^0$ and $m_B$
starts from $m_B^0$, their initial distance being smaller than a given $\delta > 0$,

$$\|m_B^0 - m_A^0\| < \delta \quad (t = 0). \tag{6.33}$$

These orbits are assumed to be complete (or, at least, to be defined for $t > 0$), i.e.
they should exist in the limit $t \to \pm\infty$ (or, at least, for $t \to +\infty$). Then orbit $A$
is stable if



St1. for every test orbit $B$ that fulfills (6.33) there is an $\varepsilon > 0$ such that
for $t > 0$ orbit $B$, as a whole, never leaves the tube with radius $\varepsilon$ around
orbit $A$ (orbital stability); or

St2. the distance of the actual position of $m_B(t)$ from orbit $A$ tends to zero
in the limit $t \to +\infty$ (asymptotic stability); or

St3. the distance of the actual positions of $m_A$ and $m_B$ at time $t$ tends to
zero as $t \to +\infty$ (Liapunov stability).

Fig. 6.5. Stability of an orbit (A) for the example of a system in two dimensions.
(a) Orbital stability, (b) asymptotic stability, (c) Liapunov stability






In Fig. 6.5 we sketch the three types of stability for the example of a dynamical
system in two dimensions. Clearly, analogous criteria can be applied to the past,
i.e. the limit $t \to -\infty$. As a special case, $m_A$ may be taken to be an equilibrium
position in which case orbit $A$ shrinks to a point. Orbital stability as defined by St1
is the weakest form and corresponds to case S1 of Sect. 6.2.3. The two remaining
cases (St2 and St3) are now equivalent and correspond to S2 of Sect. 6.2.3.

Remarks: Matters become particularly simple for vector fields on two-dimen- 
sional manifolds. We quote the following propositions for this case. 

Proposition I. Let $F$ be a vector field on the compact, connected manifold $M$
(with $\dim M = 2$) and let $B$ be a minimal set in the sense of definition (iv) above.
Then $B$ is either a critical point or a periodic orbit, or else $B = M$ and $M$ has
the structure of a two-dimensional torus $T^2$.

Proposition II. If, in addition, $M$ is orientable and if the integral curve $\Phi_t(m)$
contains no critical points for $t > 0$, then either $\Phi_t(m)$ is dense in $M$ (it covers
all of $M$), which is $T^2$, or $\Phi_t(m)$ is a closed orbit.

For the proofs we refer to Abraham and Marsden (1981, Sect. 6.1). 

As an example for motion on the torus $T^2$, let us consider two uncoupled
oscillators

$$\dot p_1 + \omega_1^2 q_1 = 0, \qquad \dot p_2 + \omega_2^2 q_2 = 0. \tag{6.34}$$

Transformation to action and angle coordinates (see Sect. 2.37.2, Example (vi))
gives

$$q_i = \sqrt{2P_i/\omega_i}\,\sin Q_i, \qquad p_i = \sqrt{2\omega_i P_i}\,\cos Q_i, \qquad i = 1, 2,$$

with $P_i = I_i = \text{const}$. The integration constants $I_1$, $I_2$ are proportional to the
energies of the individual oscillators, $I_i = E_i/\omega_i$. The complete solutions

$$P_1 = I_1, \quad P_2 = I_2, \quad Q_1 = \omega_1 t + Q_1(0), \quad Q_2 = \omega_2 t + Q_2(0) \tag{6.35}$$

lie on tori $T^2$ in the four-dimensional phase space $\mathbb{R}^4$ that are fixed by the constants
$I_1$ and $I_2$. If the ratio of frequencies is rational,

$$\omega_2/\omega_1 = n_2/n_1, \qquad n_i \in \mathbb{N},$$

the combined motion on the torus is periodic, the period being $T = 2\pi n_1/\omega_1 =
2\pi n_2/\omega_2$. If the ratio $\omega_2/\omega_1$ is irrational, there are no closed orbits and the orbits
cover the torus densely. (Note, however, that as the rationals are dense in the real
numbers, the orbits of the former case are dense in the latter.) For another and
nonlinear example we refer to Sect. 6.3.3(ii) below.
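
The difference between rational and irrational frequency ratios can be made visible with the angle variables (6.35) alone. The following sketch assumes $\omega_1 = 2$ and either $\omega_2 = 3$ or $\omega_2 = \sqrt{2}\,\omega_1$; the helper names are hypothetical. In the irrational case it records $Q_2$ each time $Q_1$ has completed a full turn and measures the largest gap left uncovered on the circle.

```python
import numpy as np

TWO_PI = 2.0 * np.pi

def angles(t, w1, w2, Q0=(0.0, 0.0)):
    """Angle variables (6.35) at time t, reduced modulo 2*pi."""
    return np.array([(w1 * t + Q0[0]) % TWO_PI, (w2 * t + Q0[1]) % TWO_PI])

def torus_distance(a, b):
    """Largest angular separation of two points on the torus, respecting the 2*pi periodicity."""
    d = np.abs(a - b) % TWO_PI
    return float(np.max(np.minimum(d, TWO_PI - d)))

# Rational ratio w2/w1 = n2/n1 = 3/2: the orbit closes after T = 2*pi*n1/w1.
w1, w2, n1 = 2.0, 3.0, 2
T = TWO_PI * n1 / w1
closing_error = torus_distance(angles(T, w1, w2), angles(0.0, w1, w2))

# Irrational ratio: record Q2 at the times t_k = 2*pi*k/w1 at which Q1 returns.
w2_irr = w1 * np.sqrt(2.0)
k = np.arange(1, 1001)
section = np.sort((w2_irr * TWO_PI * k / w1) % TWO_PI)
gaps = np.diff(np.concatenate([section, [section[0] + TWO_PI]]))
largest_gap = float(gaps.max())
```

For the rational ratio `closing_error` vanishes up to rounding. For the irrational ratio `largest_gap` never reaches zero but keeps shrinking as more returns are recorded, which is the numerical signature of an orbit that covers the torus densely.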

6.3.3 Attractors 

Let $F$ be a complete vector field on $M = \mathbb{R}^n$ (or on another smooth manifold $M$,
for that matter) that defines a dynamical system of the type of (6.1). A subset $A$ of
$M$ is said to be an attractor of the dynamical system if it is closed and invariant
(in the sense of definition 6.3.2(iii)) and if it obeys the following conditions.

(i) $A$ is contained in an open domain $U_0$ of $M$ that is positively invariant itself.
Thus, according to definition 6.3.2(i), $U_0$ has the property

$$\Phi_t(U_0) \subset U_0 \quad\text{for } t > 0.$$

(ii) For any neighborhood $V$ of $A$ contained entirely in $U_0$ (i.e. which is such
that $A \subset V \subset U_0$), one can find a positive time $T > 0$ beyond which the image
of $U_0$ by the flow $\Phi_t$ of $F$ is contained in $V$,

$$\Phi_t(U_0) \subset V \quad\text{for all } t > T.$$

The first condition says that there should exist open domains of $M$ that contain the
attractor and that do not disperse under the action of the flow, for large positive
times. The second condition says that, asymptotically, integral curves within such
domains converge to the attractor. In the case of the damped oscillator, Fig. 6.3,
the origin is a (pointlike) attractor. Here, $U_0$ can be taken to be the whole of $\mathbb{R}^2$
because any orbit is attracted to the point $(0, 0)$ like a spiral. It may happen that $M$
contains several attractors (which need not be isolated points) and therefore that
each individual attractor attracts the flow only in a finite subset of $M$. For this
reason one defines the basin of an attractor to be the union of all neighborhoods
of $A$ that fulfill the two conditions (i) and (ii). Exercise 6.6 gives a simple example.

Regarding condition (ii) one may ask whether, for fixed $U_0$, one
can choose the neighborhood $V$ such that it does not drift out of $U_0$ under the action
of the flow for positive times, i.e. whether $V_t \equiv \Phi_t(V) \subset U_0$ for all $t > 0$. If
this latter condition is fulfilled, the attractor $A$ is said to be stable. The following
examples may help to illustrate the concept of attractor in more depth.

Example (i) Forced oscillations (Van der Pol’s equation). The model (6.15) of
a pendulum with damping or external excitation is physically meaningful only in
a small domain, for several reasons. The equation of motion being a linear one,
it tells us that if $q(t)$ is a solution, so is every $\bar q(t) = \lambda q(t)$, with $\lambda$ an arbitrary
real constant. Thus, by this simple rescaling, the amplitude and the velocity can be
made arbitrarily large. The assumption that friction is proportional to $\dot q$ then cannot
be a good approximation. On the other hand, if one chooses $\gamma$ to be negative,
then according to (6.30) the energy that is delivered to the system grows beyond
all limits. It is clear that either extrapolation - rescaling or arbitrarily large energy
supply - must be limited by nonlinear dynamical terms.

In an improved model one will choose the coefficient $\gamma$ to depend on the amplitude
in such a way that the oscillation is stabilized: if the amplitude stays below
a certain critical value, we wish the oscillator to be excited; if it exceeds that value,
we wish the oscillator to be damped. Thus, if $u(t)$ denotes the deviation from the
state of rest, (6.15) shall be replaced by

$$m\ddot u(t) + 2m\gamma(u)\dot u(t) + m\omega^2 u(t) = 0, \tag{6.36a}$$

where

$$\gamma(u) := -\gamma_0\left(1 - u^2(t)/u_0^2\right) \tag{6.36b}$$

and where $\gamma_0 > 0$. $u_0$ is the critical amplitude beyond which the motion is damped.
For small amplitudes $\gamma(u)$ is negative, i.e. the motion is enhanced.

We introduce the dimensionless variables

$$\tau := \omega t, \qquad q(\tau) := \left(\sqrt{2\gamma_0}/u_0\sqrt{\omega}\right)u(t)$$

and set $p := \dot q(\tau)$. The equation of motion can be written in the form of (6.28),
where

$$H = \frac{1}{2}(p^2 + q^2) \quad\text{and}\quad R(q, p) = -(\varepsilon - q^2)p, \qquad \varepsilon := 2\gamma_0/\omega,$$

and therefore

$$\dot q = p, \qquad \dot p = -q + (\varepsilon - q^2)p. \tag{6.36c}$$

Figure 6.6 shows three solutions of this model for the choice $\varepsilon = 0.4$ that are
obtained by numerically integrating the system (6.36c) (the reader is invited to
repeat this calculation). The figure shows clearly that the solutions tend rapidly to
a limit curve, which is itself a solution of the system. (In Exercise 6.9 one is invited
to find out empirically at what rate the solutions converge to the attractor.) Point
A, which starts from the initial condition $(q_0 = -0.25, p_0 = 0)$, initially moves
outward and, as time goes on, clings to the attractor from the inside. Points B
$(q_0 = -0.5, p_0 = 4)$ and C $(q_0 = -4, p_0 = 0)$ start outside and tend rapidly to the
attractor from the outside. In this example the attractor seems to be a closed, and
hence periodic, orbit. (We can read this from Fig. 6.6 but not what the dimension of
the attractor is.) Figure 6.7 shows the coordinate $q(\tau)$ of the point A as a function
of the time parameter $\tau$. After a time interval of about twenty times the inverse
frequency of the unperturbed oscillator it joins the periodic motion on the attractor.
On the attractor the time average of the oscillator’s energy $E = (p^2 + q^2)/2$ is
conserved. This means that, on average, the driving term proportional to $\varepsilon$ feeds
as much energy into the system as the latter loses through damping. From (6.29)
we have

$$dE/d\tau = \varepsilon p^2 - q^2 p^2.$$

Taking the time average, we have $\overline{dE/d\tau} = 0$, and hence

$$\varepsilon\,\overline{p^2} = \overline{q^2 p^2}, \tag{6.37}$$

the left-hand side being the average energy supply, the right-hand side the average
loss through friction.
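
The calculation behind Fig. 6.6 and the balance (6.37) can be repeated with a short program. The following is a sketch, not the procedure used for the figures: it assumes a fixed-step Runge-Kutta scheme, a step size of 0.01 and an (arbitrary) transient of 100 time units that is discarded before averages are taken.

```python
def vdp_step(q, p, eps, dt):
    """One RK4 step for the system (6.36c): dq/dtau = p, dp/dtau = -q + (eps - q^2) p."""
    def f(q, p):
        return p, -q + (eps - q * q) * p
    k1q, k1p = f(q, p)
    k2q, k2p = f(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = f(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = f(q + dt * k3q, p + dt * k3p)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0)

def late_orbit(q, p, eps=0.4, dt=0.01, t_end=200.0, t_skip=100.0):
    """Integrate from (q, p) and return the points recorded after the transient t_skip."""
    points = []
    for i in range(int(round(t_end / dt))):
        q, p = vdp_step(q, p, eps, dt)
        if (i + 1) * dt >= t_skip:
            points.append((q, p))
    return points

eps = 0.4
starts = {"A": (-0.25, 0.0), "B": (-0.5, 4.0), "C": (-4.0, 0.0)}
orbits = {name: late_orbit(q0, p0, eps) for name, (q0, p0) in starts.items()}

# All three orbits should end up on the same limit curve: compare the largest |q|.
amplitudes = {name: max(abs(q) for q, _ in pts) for name, pts in orbits.items()}

# Time averages entering (6.37), taken along the late part of orbit A.
pts = orbits["A"]
supply = eps * sum(p * p for _, p in pts) / len(pts)
loss = sum(q * q * p * p for q, p in pts) / len(pts)
```

Because the averaging window is not an exact multiple of the period, `supply` and `loss` agree only to within a percent or so; lengthening the window improves the agreement.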

Fig. 6.7. Motion of the point A of Fig. 6.6, with initial condition $(q = -0.25, p = 0)$ for
$\varepsilon = 0.4$, as a function of time. It quickly joins the periodic orbit on the attractor

Fig. 6.8. Motion of the point A, with initial condition $(-0.25, 0)$ as in Fig. 6.7 but here
with $\varepsilon = 5.0$

For $\varepsilon = 0.4$ the attractor resembles a circle and the oscillation shown in Fig. 6.7
is still approximately a harmonic one. If, instead, we choose $\varepsilon$ appreciably larger,
the limit curve gets strongly deformed and takes more the shape of a hysteresis
curve. At the same time, $q(\tau)$ shows a behavior that deviates strongly from a sine
curve. Figure 6.8 shows the example $\varepsilon = 5.0$. The time variation of $q(\tau)$ shows
clearly that it must contain at least two different scales.
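
The change in character between $\varepsilon = 0.4$ and $\varepsilon = 5.0$ can be quantified in a rough way. The sketch below integrates (6.36c) for both values and extracts two numbers from the late part of the orbit: the period, and the ratio of the largest $|p|$ to the largest $|q|$, which equals 1 for a harmonic oscillation of unit frequency. Both diagnostics, the step size and the discarded transient are choices made here for illustration.

```python
def vdp_step(q, p, eps, dt):
    """One RK4 step for the system (6.36c)."""
    def f(q, p):
        return p, -q + (eps - q * q) * p
    k1q, k1p = f(q, p)
    k2q, k2p = f(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = f(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = f(q + dt * k3q, p + dt * k3p)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0)

def orbit(eps, q=-0.25, p=0.0, dt=0.005, t_end=100.0, t_skip=40.0):
    """Integrate from point A and return samples (tau, q, p) recorded after t_skip."""
    samples = []
    for i in range(int(round(t_end / dt))):
        q, p = vdp_step(q, p, eps, dt)
        tau = (i + 1) * dt
        if tau >= t_skip:
            samples.append((tau, q, p))
    return samples

def period(samples):
    """Mean spacing of upward zero crossings of q, located by linear interpolation."""
    crossings = []
    for (t0, q0, _), (t1, q1, _) in zip(samples, samples[1:]):
        if q0 < 0.0 <= q1:
            crossings.append(t0 + (t1 - t0) * (-q0) / (q1 - q0))
    return (crossings[-1] - crossings[0]) / (len(crossings) - 1)

def shape_ratio(samples):
    """max|p| / max|q|: a crude indicator of the deviation from harmonic motion."""
    return max(abs(p) for _, _, p in samples) / max(abs(q) for _, q, _ in samples)

small, large = orbit(0.4), orbit(5.0)
T_small, T_large = period(small), period(large)
ratio_small, ratio_large = shape_ratio(small), shape_ratio(large)
```

For $\varepsilon = 0.4$ the period stays close to $2\pi$ and the ratio close to 1. For $\varepsilon = 5.0$ the period is markedly longer while the velocity peaks become much larger than the amplitude, reflecting slow drifts interrupted by fast jumps.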

Example (ii) Two coupled Van der Pol oscillators. The second example is directly
related to, and makes use of the results of, the first. We consider two identical
systems of the type (6.36c) but add to them a linear coupling interaction. In order
to avoid resonances, we introduce an extra term into the equations that serves the
purpose of taking the unperturbed frequencies out of tune. Thus, the equations of
motion read

$$\begin{aligned}
\dot q_i &= p_i, \qquad i = 1, 2, \\
\dot p_1 &= -q_1 + (\varepsilon - q_1^2)p_1 + \lambda(q_2 - q_1), \\
\dot p_2 &= -q_2 - \varrho q_2 + (\varepsilon - q_2^2)p_2 + \lambda(q_1 - q_2).
\end{aligned} \tag{6.38}$$

Here $\varrho$ is the detuning parameter while $\lambda$ describes the coupling. Both are assumed
to be small.

For $\lambda = \varrho = 0$ we obtain the picture of the first example, shown in Fig. 6.6,
for each variable: two limit curves in two planes of $\mathbb{R}^4$ that are perpendicular to
each other and whose form is equivalent to a circle. Their direct product defines a
torus $T^2$, embedded in $\mathbb{R}^4$. This torus being the attractor, orbits in its neighborhood
converge towards it, at an approximately exponential rate. For small perturbations,
i.e. $\varrho, \lambda \ll \varepsilon$, one can show that the torus remains stable as an attractor for the
coupled system (see Guckenheimer and Holmes 2001, Sect. 1.8). Note, however,
the difference from the Hamiltonian system (6.35). There, for given energies $E_1$, $E_2$,
the torus is the manifold of motions, i.e. all orbits start and stay on it, for all times.
Here, the torus is the attractor to which the orbits tend in the limit $t \to +\infty$. The
manifold of motions is four-dimensional but, as time increases, it “descends” to a
submanifold of dimension two.

6.3.4 The Poincaré Mapping

A particularly clear topological method of studying the flow in the neighborhood
of a closed orbit is provided by the Poincaré mapping to which we now turn.
In essence, it consists in considering local transverse sections of the flow, rather
than the flow as a whole, i.e. the intersections of integral curves with some local
hypersurfaces that are not tangent to them. For example, if the flow lies in the
two-dimensional space $\mathbb{R}^2$, we let it go through local line segments that are chosen in
such a way that they do not contain any integral curve or parts thereof. One then
studies the set of points where the integral curves cross these line segments and
tries to analyze the structure of the flow by means of the pattern that one obtains
in this way. Figure 6.6 shows a transverse section for a flow in two dimensions
(dashed line). The set of intersection points of the orbit starting in A and this line
section shows the average exponential approach to the attractor (see also Exercise
6.10).
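
The sequence of intersection points can be generated numerically for the system (6.36c). The sketch below is an illustration under assumptions: since the position of the dashed line in Fig. 6.6 is not specified here, the half-line $p = 0$, $q > 0$ is used as the transverse section (the vector field there is $(0, -q)$, hence not tangent to it), and crossings are located by linear interpolation between integration steps.

```python
def vdp_step(q, p, eps, dt):
    """One RK4 step for the system (6.36c)."""
    def f(q, p):
        return p, -q + (eps - q * q) * p
    k1q, k1p = f(q, p)
    k2q, k2p = f(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = f(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = f(q + dt * k3q, p + dt * k3p)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0)

def section_points(q, p, eps=0.4, dt=0.005, t_end=50.0):
    """Return q at successive crossings of the half-line p = 0, q > 0 (p going from + to -)."""
    hits = []
    for _ in range(int(round(t_end / dt))):
        q_new, p_new = vdp_step(q, p, eps, dt)
        if p > 0.0 >= p_new and q_new > 0.0:
            s = p / (p - p_new)                  # linear interpolation to p = 0
            hits.append(q + s * (q_new - q))
        q, p = q_new, p_new
    return hits

hits = section_points(-0.25, 0.0)                       # orbit starting in A
distances = [abs(h - hits[-1]) for h in hits[:-1]]      # distance from the last recorded hit
```

The entries of `hits` increase monotonically towards the intersection of the limit curve with the section, and `distances` shrinks by a roughly constant factor per turn once the orbit is close to the attractor.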

A flow in three dimensions is cut locally by planes or other two-dimensional
smooth surfaces that are chosen such that they do not contain any integral curves.
An example is shown in Fig. 6.9: at every turn the periodic orbit $\Gamma$ crosses the
transverse section $S$ at the same point, while a neighboring, nonperiodic orbit cuts
the surface $S$ at a sequence of distinct points. With these examples in mind the
following general definition will be readily plausible.

Fig. 6.9. Transverse section for a periodic orbit in $\mathbb{R}^3$



Definition. Let $F$ be a vector field on $M = \mathbb{R}^n$ (or on any other smooth manifold
of dimension $n$). A local transverse section of $F$ at the point $x \in M$ is an open
neighborhood $S$ on a hypersurface of dimension $\dim S = \dim M - 1 = n - 1$ (i.e.
a submanifold of $M$) that contains $x$ and is chosen in such a way that, at none of
the points $s \in S$ does the vector field $F(s)$ lie in the tangent space $T_s S$.

The last condition makes sure that all flow lines going through points $s$ of $S$
do indeed intersect with $S$ and that none of them lies in $S$.

Consider a periodic orbit $\Gamma$ with period $T$, and let $S$ be a local transverse
section at a point $x_0$ on $\Gamma$. Without restriction of generality we may take $x_0(t =
0) = 0$. Clearly, we have $x_0(nT) = 0$, for all integers $n$. As $F$ does not vanish in
$x_0$, there is always a transverse section $S$ that fulfills the conditions of the definition
above. Let $S_0$ be a neighborhood of $x_0$ that is contained in $S$. We ask the
question at what time $\tau(x)$ an arbitrary point $x \in S_0$ that follows the flow is taken
back to the transverse section $S$ for the first time. For $x = x_0$ the answer is simply
$\tau(x_0) = T$ and $\Phi_T(x_0) = \Phi_0(x_0) = x_0$. However, points in the neighborhood of
$x_0$ may return to $S$ later or earlier than $T$ or else may not return to the transverse
section at all. The initial set $S_0$, after one turn, is mapped onto a neighborhood
$S_1$, i.e. into the set of points

$$S_1 = \{\Phi_{\tau(x)}(x) \mid x \in S_0\}. \tag{6.39}$$

Note that different points of $S_0$ need different times for returning to $S$ for the first
time (if they escape, this time is infinite). Therefore, $S_1$ is not a front of the flow.
The mapping generated in this way,

$$\Pi : S_0 \to S_1 : x \mapsto \Phi_{\tau(x)}(x), \tag{6.40}$$

is said to be the Poincaré mapping. It describes the behavior of the flow, as a
function of discretized time, on a submanifold $S$ whose dimension is one less
than the dimension of the manifold $M$ on which the dynamical system is defined.
Figure 6.10 shows a two-dimensional transverse section for a flow on $M = \mathbb{R}^3$.




Fig. 6.10. Poincaré mapping of an initial domain $S_0$ in the neighborhood of a periodic
orbit $\Gamma$. The point $x_0$, where $\Gamma$ hits the transverse section, is a fixed point of the mapping

Of course, the mapping (6.40) can be iterated by asking for the image $S_2$ of
$S_1$, after the next turn of all its points, etc. One obtains a sequence of open
neighborhoods

$$S_0 \xrightarrow{\ \Pi\ } S_1 \xrightarrow{\ \Pi\ } S_2 \cdots \xrightarrow{\ \Pi\ } S_n,$$

which may disperse, as time goes to $+\infty$, or may stay more or less constant, or
may shrink to the periodic orbit $\Gamma$. This provides us with a useful criterion for
the investigation of the flow’s long-term behavior in the neighborhood of a periodic
orbit, or, more generally, in the neighborhood of an attractor. In particular, the
Poincaré mapping allows for a test of stability of a periodic orbit or of an attractor.

In order to answer the question of stability in the neighborhood of the periodic
orbit $\Gamma$, it suffices to linearize the Poincaré mapping at the point $x_0$. Thus, one
considers the mapping

$$D\Pi(0) = \left\{\partial\Pi^i/\partial x^k\big|_{x=0}\right\}. \tag{6.41}$$

(In the case of a general manifold $M$ this is the tangent map $T\Pi$ at $x_0 \in M$.)
The eigenvalues of the matrix (6.41) are called characteristic multipliers of the
vector field $F$ at the periodic orbit $\Gamma$. They tell us whether there is stability or
instability in a neighborhood of the closed orbit $\Gamma$. We have the following. Let $\Gamma$
be a closed orbit of the dynamical system $F$ and let $\Pi$ be a Poincaré mapping in
$x_0 = 0$. If all characteristic multipliers lie strictly inside the unit circle, the flow
will tend to $\Gamma$ smoothly as $t \to +\infty$. This orbit is asymptotically stable. In turn,
if the absolute value of one of the characteristic multipliers exceeds one, the closed
orbit $\Gamma$ is unstable.
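
Once the matrix (6.41) is known, for instance from a numerical differentiation of the Poincaré mapping, the criterion is a statement about the moduli of its eigenvalues. A minimal sketch follows; the function name, the tolerance `tol` and the wording of the verdicts are choices made here for illustration. The marginal case, in which the largest modulus equals one, is not covered by the statement above and is reported as undecided.

```python
import numpy as np

def classify_closed_orbit(DPi, tol=1e-12):
    """Apply the multiplier criterion to a linearized Poincare map DPi (square matrix)."""
    multipliers = np.linalg.eigvals(np.atleast_2d(np.asarray(DPi, dtype=float)))
    radius = float(np.max(np.abs(multipliers)))
    if radius < 1.0 - tol:
        verdict = "asymptotically stable"      # all multipliers strictly inside the unit circle
    elif radius > 1.0 + tol:
        verdict = "unstable"                   # at least one multiplier outside the unit circle
    else:
        verdict = "undecided by the linearization"
    return multipliers, verdict

# Usage with made-up matrices:
print(classify_closed_orbit([[0.5, 0.2], [0.0, 0.3]])[1])
print(classify_closed_orbit([[1.2, 0.0], [0.0, 0.1]])[1])
```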

We study two examples. The first concerns flows in the plane for which transverse
sections are one-dimensional. The second illustrates flows on the torus $T^2$
in $\mathbb{R}^4$, or in its neighborhood, in which case the transverse section may be taken
to be a subset of planes that cut the torus.

Example (i) Consider the dynamical system

$$\begin{aligned}
\dot x_1 &= \mu x_1 - x_2 - (x_1^2 + x_2^2)^n x_1, \\
\dot x_2 &= \mu x_2 + x_1 - (x_1^2 + x_2^2)^n x_2,
\end{aligned} \tag{6.42a}$$

where the exponent $n$ takes the values $n = 1$, 2, or 3 and where $\mu$ is a real parameter.
Without the coupling terms ($-x_2$ in the first equation and $x_1$ in the second),
the system (6.42a) is invariant under rotations in the $(x_1, x_2)$-plane. On the other
hand, without the nonlinearity and with $\mu = 0$, we have the system $\dot x_1 = -x_2$,
$\dot x_2 = x_1$ whose solutions move uniformly about the origin, along concentric circles.
One absorbs this uniform rotation by introducing polar coordinates $x_1 = r\cos\phi$,
$x_2 = r\sin\phi$. The system (6.42a) becomes the decoupled system

$$\begin{aligned}
\dot r &= \mu r - r^{2n+1} \equiv -\frac{\partial}{\partial r}U(r, \phi), \\
\dot\phi &= 1 \equiv -\frac{\partial}{\partial\phi}U(r, \phi).
\end{aligned} \tag{6.42b}$$

The right-hand side of the first equation (6.42b) is a gradient flow (i.e. one whose
vector field is a gradient field, cf. Exercise 6.7), with

$$U(r, \phi) = -\frac{1}{2}\mu r^2 + \frac{1}{2n+2}\,r^{2n+2} - \phi. \tag{6.43}$$

The origin $r = 0$ is a critical point. Orbits in its neighborhood behave like spirals
around $(0, 0)$ with radial dependence $r \propto \exp(\mu t)$. Thus, for $\mu < 0$, the point $(0, 0)$
is asymptotically stable. For $\mu > 0$ this point is unstable. At the same time, there
appears a periodic solution

$$x_1 = R(\mu) \cos t , \quad x_2 = R(\mu) \sin t \quad \text{with} \quad R(\mu) = \sqrt[2n]{\mu} ,$$

which turns out to be an asymptotically stable attractor: solutions starting outside 
the circle with radius R(p) move around it like spirals and tend exponentially to- 
wards the circle, for increasing time; likewise, solutions starting inside the circle 
move outward like spirals and tend to the circle from the inside. (The reader is
invited to sketch the flow for $\mu > 0$.)

In this example it is not difficult to construct a Poincaré mapping explicitly. It is
sufficient to cut the flow in the $(x_1, x_2)$-plane with the semi-axis $\phi = \phi_0 = \text{const}$.
Starting from $(x_1^0, x_2^0)$ on this line, with $r_0 = \sqrt{(x_1^0)^2 + (x_2^0)^2}$, the flow hits the
line again after the time $t = 2\pi$. The image of the starting point has the distance
$r_1 = \Pi(r_0)$, where $r_1$ is obtained from the first equation (6.42b). Indeed, if $\Psi_t$
denotes the flow of that equation, $r_1 = \Psi_{t=2\pi}(r_0)$.

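Let us take the special case $n = 1$ and $\mu > 0$. Taking the time variable $\tau = \mu t$,
the system (6.42b) becomes

$$\frac{dr}{d\tau} = r \left( 1 - \frac{1}{\mu} r^2 \right) , \qquad \frac{d\phi}{d\tau} = \frac{1}{\mu} . \tag{6.44}$$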



With $r(\tau) = 1/\sqrt{\rho(\tau)}$ we obtain the differential equation $d\rho/d\tau = 2(1/\mu - \rho)$,
which can be integrated analytically. One finds $\rho(c, \tau) = 1/\mu + c \exp(-2\tau)$, $c$
being an integration constant determined from the initial condition
$\rho(\tau = 0) = \rho_0 = 1/r_0^2$. Thus, the integral curve of (6.44) starting from $(r_0, \phi_0)$ reads

$$\Phi_\tau(r_0, \phi_0) = \left( 1/\sqrt{\rho(c, \tau)} ,\; \phi_0 + \tau/\mu \bmod 2\pi \right) \tag{6.45}$$



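with $c = 1/r_0^2 - 1/\mu$. Hence, the Poincaré mapping that takes $(r_0, \phi_0)$ to $(r_1, \phi_1 = \phi_0)$
is given by (6.45) with $\tau = 2\pi\mu$, viz.

$$\Pi(r_0) = \left[ \frac{1}{\mu} + \left( \frac{1}{r_0^2} - \frac{1}{\mu} \right) e^{-4\pi\mu} \right]^{-1/2} . \tag{6.46}$$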



This has the fixed point $r_0 = \sqrt{\mu}$, which represents the periodic orbit. Linearizing
in the neighborhood of this fixed point we find

$$D\Pi(r_0 = \sqrt{\mu}) = \left. \frac{dr_1}{dr_0} \right|_{r_0 = \sqrt{\mu}} = e^{-4\pi\mu} .$$

The characteristic multiplier is $\lambda = \exp(-4\pi\mu)$. Its absolute value is smaller than
1 and hence the periodic orbit is an asymptotically stable attractor.
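
The closed form (6.46) and the value of the multiplier can be cross-checked numerically. The following is a minimal sketch, not part of the original text: it integrates the radial equation of (6.42b) for $n = 1$ over one turn, $t = 2\pi$, with a hand-written Runge-Kutta step and compares the result with (6.46). The function names, the step count, and the sample values $\mu = 0.1$, $r_0 = 0.5$ are illustrative assumptions.

```python
import math

def poincare_map(r0, mu):
    """Closed-form return map (6.46) for n = 1 and mu > 0."""
    rho = 1.0 / mu + (1.0 / r0**2 - 1.0 / mu) * math.exp(-4.0 * math.pi * mu)
    return 1.0 / math.sqrt(rho)

def radial_flow(r0, mu, t_end=2.0 * math.pi, steps=20000):
    """Integrate dr/dt = mu*r - r**3 from t = 0 to t_end with classical RK4."""
    f = lambda r: mu * r - r**3
    h = t_end / steps
    r = r0
    for _ in range(steps):
        k1 = f(r)
        k2 = f(r + 0.5 * h * k1)
        k3 = f(r + 0.5 * h * k2)
        k4 = f(r + h * k3)
        r += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r

def multiplier(mu, eps=1e-6):
    """Central-difference estimate of dPi/dr0 at the fixed point sqrt(mu)."""
    r_star = math.sqrt(mu)
    return (poincare_map(r_star + eps, mu) - poincare_map(r_star - eps, mu)) / (2.0 * eps)

mu = 0.1                              # illustrative parameter value
r0 = 0.5                              # start outside the limit cycle of radius sqrt(mu)
r1_closed = poincare_map(r0, mu)      # image of r0 from (6.46)
r1_numeric = radial_flow(r0, mu)      # image of r0 from direct integration
lam = multiplier(mu)                  # should be close to exp(-4*pi*mu)
```

Iterating `poincare_map` from any $r_0 > 0$ drives the radius towards $\sqrt{\mu}$, each step shrinking the distance by roughly the factor `lam` once the orbit is close to the circle.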

Example (ii) Consider the flow of an autonomous Hamiltonian system with
$f = 2$ for which there are two integrals of the motion. Suppose we have already
found a canonical transformation to action and angle coordinates, i.e. one by which
both coordinates are made cyclic,

$$\{q_1, q_2, p_1, p_2, H\} \to \{\theta_1, \theta_2, I_1, I_2, \tilde{H}\} , \tag{6.47}$$

and $\tilde{H} = \omega_1 I_1 + \omega_2 I_2$. An example is provided by the decoupled oscillators (6.34).
As both $\theta_k$ are cyclic, we have

$$\dot{I}_i(q, p) = 0 , \quad \text{or} \quad I_i(q, p) = \text{const} = I_i(q_0, p_0)$$

along any orbit. Returning to the old coordinates for the moment, this means that 
the Poisson brackets 

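$$\{H, I_i\} \quad \text{and} \quad \{I_i, I_j\} \qquad (i, j = 1, 2) \tag{6.48}$$

vanish.³ In the new coordinates we have

$$\dot{\theta}_i = \partial \tilde{H}/\partial I_i = \omega_i \quad \text{or} \quad \theta_i(t) = \omega_i t + \theta_i^0 . \tag{6.49}$$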

From (6.49) we see that the manifold of motions is the torus $T^2$, embedded in the
four-dimensional phase space. For the transversal section of the Poincaré mapping
it is natural to choose a part $S$ of a plane that cuts the torus and is perpendicular
to it. Let $\theta_1(t)$ be the angular variable running along the torus, and $\theta_2(t)$ the one
running along a cross section of the torus. A point $s \in S_0 \subset S$ returns to $S$ for
the first time after $T = 2\pi/\omega_1$. Without loss of generality we measure time in
units of this period $T$, $\tau = t/T$, and take $\theta_1^0 = 0$. Then we have

$$\theta_1(\tau) = 2\pi\tau , \quad \theta_2(\tau) = 2\pi\tau\, \omega_2/\omega_1 + \theta_2^0 . \tag{6.49'}$$

Call $C$ the curve of intersection of the torus and the transverse section $S$. The
Poincaré mapping maps points of $C$ on the same curve. The points of intersection
of the orbit (6.49') with $S$ appear, one after the other, at $\tau = 0, 1, 2, \ldots$ If the ratio
of frequencies is rational, $\omega_2/\omega_1 = m/n$ (with $m$ and $n$ relatively prime), the first
$(n - 1)$ images of the point $\theta_2 = \theta_2^0$ are distinct points on $C$, while the $n$th image
coincides with the starting point. If, in turn, the ratio $\omega_2/\omega_1$ is irrational, a point
$s_0$ on $C$ is shifted, at each iteration of the Poincaré mapping, by the azimuth
$2\pi\omega_2/\omega_1$. It never returns to its starting position. For large times the curve $C$ is
covered discontinuously but densely.
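
The two cases can be made concrete with a few lines of code. The sketch below, which is an illustration and not part of the original text, evaluates $\theta_2$ of (6.49') at $\tau = 0, 1, 2, \ldots$ in units of $2\pi$. The ratio $3/7$ and the golden mean are arbitrary sample values, and the function name `section_points` is hypothetical. Exact rational arithmetic is used in the first case so that the return to the starting point is not blurred by rounding.

```python
import math
from fractions import Fraction

def section_points(winding, theta0, count):
    """Angles theta_2 / (2*pi) on the curve C after 0, 1, ..., count-1 returns.

    `winding` stands for omega_2 / omega_1. Angles are measured in units of
    2*pi and reduced modulo 1, following (6.49') at tau = 0, 1, 2, ...
    """
    return [(theta0 + k * winding) % 1 for k in range(count)]

# Rational ratio m/n = 3/7, handled exactly.
rational = section_points(Fraction(3, 7), Fraction(0), 8)
images = rational[1:7]                         # the first n - 1 = 6 images
all_distinct = len(set(rational[:7])) == 7     # start and 6 images are distinct
closes_after_n = rational[7] == rational[0]    # the 7th image is the start

# Irrational ratio (golden mean, in floating point): C fills up densely.
golden = (math.sqrt(5.0) - 1.0) / 2.0
irrational = sorted(section_points(golden, 0.0, 2000))
gaps = [b - a for a, b in zip(irrational, irrational[1:] + [irrational[0] + 1.0])]
largest_gap = max(gaps)                        # shrinks as more returns are added
```

In the irrational case the largest gap between neighboring intersection points keeps shrinking as the number of returns grows, which is the numerical counterpart of the statement that $C$ is covered densely.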

6.3.5 Bifurcations of Flows at Critical Points 

In Example (i) of the previous section the flow is very different for positive and
negative values of the control parameter. For $\mu < 0$ the origin is the only critical
element. It turns out to be an asymptotically stable equilibrium. For $\mu > 0$ the
flow has the critical elements $\{0, 0\}$ and $\{R(\mu) \cos t, R(\mu) \sin t\}$. The former is an
unstable equilibrium position, the latter a periodic orbit that is an asymptotically
stable attractor. If we let $\mu$ vary from negative to positive values, then, at $\mu = 0$,
a stable, periodic orbit branches off from the previously stable equilibrium point
$\{0, 0\}$. At the same time, the equilibrium position becomes unstable as shown in
Fig. 6.11. Another way of expressing the same result is to say that the origin acts
like a sink for the flow at $\mu < 0$. For $\mu > 0$ it acts like a source of the flow,
while the periodic orbit with radius $R(\mu)$ is a sink. The structural change of the
flow happens at the point $(\mu = 0, r = 0)$, in the case of this specific example. A
point of this nature is said to be a bifurcation point.

³ $H$, $I_1$, and $I_2$ are in involution; for definitions cf. Sect. 2.37.2
Fig. 6.11. For the system (6.42a) the point $r = 0$ is asymptotically stable
for $\mu < 0$. At $\mu = 0$ a periodic solution (circle with radius $R(\mu)$) splits
off and becomes an asymptotically stable attractor. At the same time the
point $r = 0$ becomes unstable



The general case is that of the dynamical system 

$$\dot{x} = F(\mu, x) , \tag{6.50}$$

whose vector field depends on a set $\mu = \{\mu_1, \mu_2, \ldots, \mu_k\}$ of $k$ control parameters.
The critical points $x_0(\mu)$ of the system (6.50) are obtained from the equation

$$F(\mu, x_0) = 0 . \tag{6.51}$$

The solutions of this implicit equation, in general, depend on the values of the
parameters $\mu$. They are smooth functions of $\mu$ if and only if the determinant of
the matrix of partial derivatives $DF = \{\partial F^i(\mu, x)/\partial x^k\}$ does not vanish in $x_0$.
This is a consequence of the theorem on implicit functions, which guarantees that
(6.51) can be solved for $x_0$, provided that the condition is fulfilled. The points
$(\mu, x_0)$ where this condition is not fulfilled, i.e. where $DF$ has at least one vanishing
eigenvalue, need special consideration. Here, several branches of differing
stability may merge or split off from each other. By crossing this point, the flow
changes its structure in a qualitative manner. Therefore, a point $(\mu, x_0)$ where the
determinant of $DF$ vanishes, or, equivalently, where at least one of its eigenvalues
vanishes, is said to be a bifurcation point.

The general discussion of the solutions of (6.51) and the complete classifica- 
tion of bifurcations is beyond the scope of this book. A good account of what is 
known about this is given by Guckenheimer and Holmes (2001). We restrict our 
discussion to bifurcations of codimension 1.⁴ Thus, the vector field depends on
only one parameter $\mu$, but is still a function of the $n$-dimensional variable $x$. If
$(\mu_0, x_0)$ is a bifurcation point, the following two forms of the matrix of partial
derivatives $DF$ are typical (cf. Guckenheimer and Holmes 2001):

$$DF(\mu, x)\big|_{\mu_0, x_0} = \begin{pmatrix} 0 & 0 \\ 0 & A \end{pmatrix} , \tag{6.52}$$

where $A$ is an $(n - 1) \times (n - 1)$ matrix, as well as

$$DF(\mu, x)\big|_{\mu_0, x_0} = \begin{pmatrix} 0 & -\omega & 0 \\ \omega & 0 & 0 \\ 0 & 0 & B \end{pmatrix} , \tag{6.53}$$

with $B$ an $(n - 2) \times (n - 2)$ matrix.

⁴ The codimension of a bifurcation is defined to be the smallest dimension of a parameter space
$\{\mu_1, \ldots, \mu_k\}$ for which this bifurcation does occur.

In the first case (6.52) $DF$ has one eigenvalue equal to zero, which is responsible
for the bifurcation. As the remainder, i.e. the matrix $A$, does not matter, we
can take the dimension of the matrix $DF$ to be $n = 1$, in the case of (6.52). Furthermore,
without restriction of generality, the variable $x$ and the control parameter
$\mu$ can be shifted in such a way that the bifurcation point that we are considering
occurs at $(\mu = 0, x_0 = 0)$. Then the following types of bifurcations are contained
in the general form (6.52).

(i) The saddle-node bifurcation : 

$$\dot{x} = \mu - x^2 . \tag{6.54}$$

For $\mu > 0$ the branch $x_0 = \sqrt{\mu}$ is the set of stable equilibria and $x_0 = -\sqrt{\mu}$ the set
of unstable equilibria, as shown in Fig. 6.12. These two branches merge at $\mu = 0$
and compensate each other because, for $\mu < 0$, there is no equilibrium position.

(ii) The transcritical bifurcation: 

$$\dot{x} = \mu x - x^2 . \tag{6.55}$$

Here the straight lines $x_0 = 0$ and $x_0 = \mu$ are equilibrium positions. For $\mu < 0$
the former is asymptotically stable and the latter is unstable. For $\mu > 0$, on the
other hand, the former is unstable and the latter is asymptotically stable, as shown
in Fig. 6.13. The four branches coincide for $\mu = 0$; the semi-axes ($x_0 = 0$, or
$x_0 = \mu$, $\mu < 0$) and ($x_0 = \mu$, or $x_0 = 0$, $\mu > 0$) exchange their character of
stability; hence the name of the bifurcation.




Fig. 6.12. Illustration of a saddle-node bifurcation at $x_0 = \mu = 0$. As in Fig. 6.11 the arrows
indicate the direction of the flow in the neighborhood of the equilibria

Fig. 6.13. The transcritical bifurcation. In crossing the point of bifurcation $\mu = 0$, the two
semi-axes exchange their character of stability
(iii) The pitchfork bifurcation:

$$\dot{x} = \mu x - x^3 . \tag{6.56}$$

All the points of the straight line $x_0 = 0$ are critical points. These are asymptotically
stable if $\mu$ is negative, but become unstable if $\mu$ is positive. In addition, for
$\mu > 0$, the points on the parabola $x_0^2 = \mu$ are asymptotically stable equilibria,
as shown in Fig. 6.14. At $\mu = 0$, the single line of stability on the left of the
figure splits into the “pitchfork” of stability (the parabola) and the semi-axis of
instability.
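
The stability statements made for the three prototypes (6.54) to (6.56) all follow from the sign of $\partial F/\partial x$ at the equilibrium. The sketch below, which is an illustration and not part of the original text, tabulates the equilibria and classifies them by that sign. The function and dictionary names are hypothetical, and the label "marginal" marks the bifurcation point itself, where the linear criterion does not decide.

```python
import math

def saddle_node_equilibria(mu):
    """Equilibria of dx/dt = mu - x**2, eq. (6.54)."""
    if mu < 0:
        return []
    if mu == 0:
        return [0.0]
    return [math.sqrt(mu), -math.sqrt(mu)]

def transcritical_equilibria(mu):
    """Equilibria of dx/dt = mu*x - x**2, eq. (6.55)."""
    return [0.0] if mu == 0 else [0.0, mu]

def pitchfork_equilibria(mu):
    """Equilibria of dx/dt = mu*x - x**3, eq. (6.56)."""
    return [0.0, math.sqrt(mu), -math.sqrt(mu)] if mu > 0 else [0.0]

# name -> (right-hand side F, derivative dF/dx, function listing the equilibria)
NORMAL_FORMS = {
    "saddle-node": (lambda mu, x: mu - x**2,
                    lambda mu, x: -2.0 * x,
                    saddle_node_equilibria),
    "transcritical": (lambda mu, x: mu * x - x**2,
                      lambda mu, x: mu - 2.0 * x,
                      transcritical_equilibria),
    "pitchfork": (lambda mu, x: mu * x - x**3,
                  lambda mu, x: mu - 3.0 * x**2,
                  pitchfork_equilibria),
}

def classify(name, mu):
    """Return [(x0, label), ...] with label 'stable', 'unstable' or 'marginal'."""
    rhs, slope, equilibria = NORMAL_FORMS[name]
    result = []
    for x0 in equilibria(mu):
        if abs(rhs(mu, x0)) > 1e-12:          # sanity check: F(mu, x0) = 0
            raise ValueError("not an equilibrium")
        s = slope(mu, x0)
        label = "stable" if s < 0 else "unstable" if s > 0 else "marginal"
        result.append((x0, label))
    return result
```

For example, `classify("pitchfork", 0.25)` lists the unstable origin together with the two stable branches $x_0 = \pm 0.5$, while `classify("saddle-node", -1.0)` returns an empty list because no equilibrium exists for $\mu < 0$.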




In all examples and prototypes considered above the signs of the nonlinear
terms are chosen such that they act against the constant or linear terms for $\mu > 0$,
i.e. in such a way that they have a stabilizing effect as one moves from the line
$x_0 = 0$ to positive $x$. The bifurcations obtained in this way are called supercritical.
It is instructive to study the bifurcation pattern (6.54-6.56) for the case of
the opposite sign of the nonlinear terms. The reader is invited to sketch the resulting
bifurcation diagrams. The so-obtained bifurcations are called subcritical.
In the case of the second normal form (6.53) the system must have at least two
dimensions and $DF$ must have (at least) two complex conjugate eigenvalues. The
prototype for this case is the following.

(iv) The Hopf bifurcation: 

$$
\begin{aligned}
\dot{x}_1 &= \mu x_1 - x_2 - (x_1^2 + x_2^2) x_1 ,\\
\dot{x}_2 &= \mu x_2 + x_1 - (x_1^2 + x_2^2) x_2 .
\end{aligned}
\tag{6.57}
$$

This is the same as the example (6.42a), with $n = 1$. We can take over the results
from there and draw them directly in the bifurcation diagram $(\mu, x_0)$. This yields
the picture shown in Fig. 6.15. (Here, again, it is instructive to change the sign of
the nonlinear term in (6.57), turning the supercritical bifurcation into a subcritical
Fig. 6.15. The Hopf bifurcation in two dimensions. The lower part of the figure shows
the behavior of the flow in the neighborhood of the asymptotically stable equilibrium and of
the asymptotically stable periodic solution



one. The reader should sketch the bifurcation diagram.) We add the remark that 
here and in (6.53) the determinant of $DF$ does not vanish at $(\mu_0, x_0)$. It does so,
however, once we have taken out the uniform rotation of the example (6.42a). One
then obtains the system (6.42b) for which the determinant of $DF$ does vanish and
whose first equation (for $n = 1$) has precisely the form (6.56). Figure 6.15 may be
thought of as being generated from the pitchfork diagram of Fig. 6.14 by a rotation
in the second $x$-dimension.
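
The remark about the determinant can be verified directly. As a short worked check, which uses only the linear part of (6.57) at the origin, one finds:

```latex
DF(\mu, x)\big|_{x = 0}
  = \begin{pmatrix} \mu & -1 \\ 1 & \mu \end{pmatrix} ,
\qquad
\det DF = \mu^2 + 1 \neq 0 ,
\qquad
\text{eigenvalues: } \mu \pm i .
```

At $\mu = 0$ this is the form (6.53) with $\omega = 1$ and no block $B$. The determinant never vanishes, but the pair of complex conjugate eigenvalues crosses the imaginary axis at $\mu = 0$, which is where the stability of the origin changes.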

6.3.6 Bifurcations of Periodic Orbits 

We conclude this section with a few remarks on the stability of closed orbits, as 
a function of control parameters. Section 6.3.5 was devoted exclusively to the bi- 
furcation of points of equilibrium. Like the closed orbits, these points belong to 
the critical elements of the vector field. Some of the results obtained there can 
be translated directly to the behavior of periodic orbits at bifurcation points, by 
means of the Poincare mapping (6.40) and its linearization (6.41). 

A qualitatively new feature, which is important for what follows, is the bifurcation
of a periodic orbit leading to period doubling. It may be described as
follows. Stability or instability of flows in the neighborhood of closed orbits is
controlled by the matrix (6.41), that is the linearization of the Poincaré mapping.
The specific bifurcation in which we are interested here occurs whenever one of the
characteristic multipliers (i.e. the eigenvalues of (6.41)) crosses the value $-1$, as a
function of the control parameter $\mu$. Let $s_0$ be the point of intersection of the periodic
orbit $\Gamma$ with a transverse section. Clearly, $s_0$ is a fixed point of the Poincaré
mapping, $\Pi(s_0) = s_0$. As long as all eigenvalues of the matrix $D\Pi(s_0)$ (6.41)
are inside the unit circle (i.e. have absolute values smaller than 1), the distance
from $s_0$ of another point $s$ in the neighborhood of $s_0$ will decrease monotonically
by successive iterations of the Poincaré mapping. Indeed, in linear approximation
we have

$$\Pi^n(s) - s_0 = (D\Pi(s_0))^n (s - s_0) . \tag{6.58}$$

Suppose the matrix $D\Pi(s_0)$ to be diagonal. We assume the first eigenvalue $\lambda_1$ to
be the one that, as a function of the control parameter $\mu$, moves outward from
somewhere inside the unit circle, by crossing the value $-1$ at some value of $\mu$.
All other eigenvalues, for simplicity, are supposed to stay inside the unit circle.
In this special situation it is sufficient to consider the Poincaré mapping only in
the 1-direction on the transverse section, i.e. in the direction the eigenvalue $\lambda_1$
refers to. Call the coordinate in that direction $u$. If we suppose that $\lambda_1(\mu)$ is
real and, initially, lies between 0 and $-1$, the orbit that hits the transverse section
at the point $s_1$ of Fig. 6.16a appears in $u_2$ after one turn, in $u_3$ after two
turns, etc. It approaches the point $s_0$ asymptotically and the periodic orbit through
$s_0$ is seen to be stable. If, on the other hand, $\lambda_1(\mu) < -1$, the orbit through
$s_1$ moves outward rapidly and the periodic orbit through $s_0$, obviously, is unstable.





Fig. 6.16a, b. Poincaré mapping in the neighborhood of a periodic orbit, for the case where a characteristic
multiplier approaches the value $-1$ from above (a), and for the case where it equals that
value (b)



A limiting situation occurs if there is a value $\mu_0$ of the control parameter for
which $\lambda_1(\mu_0) = -1$. Here we obtain the pattern shown in Fig. 6.16b: after one
turn the orbit through $s_1$ appears in $u_2 = -u_1$, after the second turn in $u_3 = +u_1$,
then in $u_4 = -u_1$, then in $u_5 = +u_1$, etc. This applies to each $s$ on the $u$-axis,
in a neighborhood of $s_0$. As a result, the periodic orbit $\Gamma$ through $s_0$ has only
a sort of saddle-point stability: orbits in directions other than the $u$-axis are attracted
towards it, but orbits whose intersections with the transverse section lie
on the $u$-axis will be caused to move away by even a small perturbation. Thus,
superficially, the point $(\mu_0, s_0)$ seems to be a point of bifurcation having the character
of the pitchfork of Fig. 6.14. A closer look shows, however, that there is
really a new phenomenon. In a system as described by the bifurcation diagram
of Fig. 6.14, the integral curve tends either to the point $x_0 = +\sqrt{\mu}$, or to the
point $x_0 = -\sqrt{\mu}$, for positive $\mu$. In the case shown in Fig. 6.16b, on the contrary,
the orbit alternates between $u_1$ and $-u_1$. In other words, it is a periodic
orbit $\Gamma_2$ with the period $T_2 = 2T$, $T$ being the period of the original reference
orbit.
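
The three regimes of the multiplier can be visualized with the linearized return map restricted to the $u$-axis. The following sketch is an illustration, not part of the original text. The sample values $-0.8$, $-1$, and $-1.2$ for the multiplier and the function name are assumptions, and the nonlinear terms that select the actual doubled orbit are left out.

```python
def section_coordinates(lam, u1, turns):
    """Coordinates u_1, u_2, ... on the u-axis under the linear map u -> lam * u,
    i.e. (6.58) restricted to the direction of the first eigenvalue."""
    points = [u1]
    for _ in range(turns):
        points.append(lam * points[-1])
    return points

inside = section_coordinates(-0.8, 1.0, 6)     # multiplier between 0 and -1
marginal = section_coordinates(-1.0, 1.0, 6)   # multiplier equal to -1
outside = section_coordinates(-1.2, 1.0, 6)    # multiplier beyond -1
```

The first list alternates in sign while shrinking towards $s_0$, the last one alternates while growing, and the marginal one repeats itself after two turns, which is the period doubling described in the text.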

If we take $\Gamma_2$ as the new reference orbit, the Poincaré mapping must be redefined
in such a way that $\Gamma_2$ hits the transverse section for the first time after
the time $T_2$. One can then study the stability of orbits in the neighborhood of $\Gamma_2$.
By varying the control parameter further, it may happen that the phenomenon of
period doubling described above happens once more at, say, $\mu = \mu_1$, and that
we have to repeat the analysis given above. In fact, there can be a sequence of
bifurcation points $(\mu_0, s_0)$, etc. at each of which the period is doubled.

We return to this phenomenon in the next section. 



6.4 Deterministic Chaos 

This section deals with a particularly impressive and characteristic example for de- 
terministic motion whose long-term behavior shows alternating regimes of chaotic 
and ordered structure and from which some surprising empirical regularities can 
be extracted. Although the example leaves the domain of mechanics proper, it 
seems to be so typical, from all that we know, that it may serve as an illustra- 
tion for chaotic behavior even in perturbed Hamiltonian systems. We discuss the 
concept of iterative mapping in one dimension. We then give a first and somewhat 
provisional definition of chaotic motion and close with the example of the logis- 
tic equation. The more quantitative aspects of deterministic chaos are deferred to 
Sect. 6.5. 

6.4.1 Iterative Mappings in One Dimension 

In Sect. 6.3.6 we made use of the Poincaré mapping of a three-dimensional flow
for investigating the stability of a closed orbit as a function of the control parameter
$\mu$. We found, in the simplest case, that the phenomenon of period doubling
could be identified in the behavior of a single dimension of the flow, provided one
concentrates on the direction for which the characteristic multiplier $\lambda(\mu)$ crosses
the value $-1$ at some critical value of $\mu$. The full dimension of the flow of $F(\mu, x)$
did not matter. We can draw two lessons from this. Firstly, it may be sufficient
to choose a single direction within the transverse section S (more generally, a 
one-dimensional submanifold of S) and to study the Poincare mapping along this 
direction only. The picture that one obtains on this one-dimensional submanifold 
may already give a good impression of the flow’s behavior in the large. Secondly, 
the restriction of the Poincare mapping to one dimension reduces the analysis of 
the complete flow and of its full, higher-dimensional complexity, to the analysis 
of an iterative mapping in one dimension. 







$$u_i \to u_{i+1} = f(u_i) . \tag{6.59}$$

In the example of Sect. 6.3.6, for instance, this iterative mapping is the sequence
of positions of a point on the transverse section at times $0, T, 2T, 3T, \ldots$ Here the
behavior of the full system (6.1) at a point of bifurcation is reduced to a difference
equation of the type (6.59).

There is another reason one-dimensional systems of the form (6.59) are of 
interest. Strongly dissipative systems usually possess asymptotically stable equi- 
libria and/or attractors. In this case a set of initial configurations filling a given 
volume of phase space will be strongly quenched, by the action of the flow and 
as time goes by, so that the Poincare mapping quickly leads to structures that look 
like pieces of straight lines or arcs of curves. This observation may be illustrated 
by the example (6.38). Although the flow of this system is four dimensional, it 
converges to the torus $T^2$, the attractor, at an exponential rate. Therefore, considering
the Poincaré mapping for large times, we see that the transverse section
of the torus will show all points of intersection lying on a circle. This is also 
true if the torus is a strange attractor. In this case the Poincare mapping shows a 
chaotic regime in a small strip in the neighborhood of the circle (see e.g. Berge, 
Pomeau, Vidal 1987). Finally, iterative equations of the type (6.59) describe spe- 
cific dynamical systems of their own that are formulated by means of difference 
equations (see e.g. Devaney 1989, Collet and Eckmann 1990). In Sect. 6.4.3 be- 
low we study a classic example of a discrete dynamical system (6.59). It belongs 
to the class of iterative mappings on the unit interval, which are defined as fol- 
lows. 

Let $f(\mu, x)$ be a function of the control parameter $\mu$ and of a real variable $x$
in the interval $[0, 1]$. $f$ is continuous, and in general also differentiable, and the
range of $\mu$ is chosen such that the iterative mapping

$$x_{i+1} = f(\mu, x_i) , \quad x \in [0, 1] , \tag{6.60}$$

does not lead out of the interval $[0, 1]$. An equation of this type can be analyzed
graphically, and particularly clearly, by comparing the graph of the function
$y(x) = f(\mu, x)$ with the straight line $z(x) = x$. The starting point $x_1$ has the image
$y(x_1)$, which is then translated to the straight line as shown in Fig. 6.17a. This
yields the next value $x_2$, whose image $y(x_2)$ is again translated to the straight line,
yielding the next iteration $x_3$, and so on. Depending on the shape of $f(\mu, x)$ and
on the starting value, this iterative procedure may converge rapidly to the fixed
point $\bar{x}$ shown in Fig. 6.17a. At this point the straight line and the graph of $f$
intersect and we have



$$\bar{x} = f(\mu, \bar{x}) . \tag{6.61}$$

The iteration $x_1 \to x_2 \to \ldots \to \bar{x}$ converges if the absolute value of the derivative
of the curve $y = f(\mu, x)$ in the point $y = \bar{x}$ is smaller than 1. In this case $\bar{x}$ is
an equilibrium position of the dynamical system (6.60), which is asymptotically
stable. If the modulus of the derivative exceeds 1, on the other hand, the point $\bar{x}$
Fig. 6.17a, b. The iteration $x_{i+1} = f(\mu, x_i)$ converges to $\bar{x}$, provided $|df/dx|_{\bar{x}} < 1$ (a). In (b) both
0 and $\bar{x}_2$ are stable but $\bar{x}_1$ is unstable



is unstable. In the example shown in Fig. 6.17a $\bar{x} = 0$ is unstable. Figure 6.17b
shows an example where $\bar{x}_0 = 0$ and $\bar{x}_2$ are stable, while $\bar{x}_1$ is unstable. By the
iteration (6.60) initial values $x_1 < \bar{x}_1$ tend to $\bar{x}_0$, while those with $\bar{x}_1 < x_1 \le 1$
tend to $\bar{x}_2$.
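
The graphical procedure amounts to repeated application of $f$. As a minimal sketch, not part of the original text, the code below iterates a map on $[0, 1]$ and checks the derivative criterion at the limit. The map $f(x) = \cos x$ is a hypothetical stand-in chosen only because it maps $[0, 1]$ into itself and has a single attracting fixed point; it carries no control parameter.

```python
import math

def iterate(f, x1, steps):
    """Return the sequence x_1, x_2, ..., x_{steps+1} generated by x -> f(x)."""
    xs = [x1]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

# Hypothetical example map (not from the text): cos maps [0, 1] into [cos 1, 1].
f = math.cos
df = lambda x: -math.sin(x)      # derivative of the example map

xs = iterate(f, 0.2, 100)        # start value 0.2 is an arbitrary choice
x_bar = xs[-1]                   # numerical fixed point, f(x_bar) = x_bar
slope = abs(df(x_bar))           # modulus of the derivative at the fixed point
```

Because `slope` is smaller than 1, the distance to `x_bar` shrinks by about that factor at every step, which is the linear convergence that the cobweb construction of Fig. 6.17a displays.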

The nature and the position of the equilibria are determined by the control parameter
$\mu$. If we let $\mu$ vary within its allowed interval of variation, we may cross
certain critical values at which the stability of points of convergence changes and, 
hence, where the structure of the dynamical system changes in an essential and 
qualitative way. In particular, there can be bifurcations of the type described in 
Sects. 6.3.5 and 6.3.6. We do not pursue the general discussion of iterated map- 
pings (6.60) here and refer to the excellent monographs by Collet and Eckmann 
(1990) and Guckenheimer and Holmes (1990). An instructive example is given in 
Sect. 6.4.3 below. Also we strongly recommend working out the PC-assisted exam- 
ples of Exercises 6.12-14, which provide good illustrations for iterative mappings 
and give an initial feeling for chaotic regimes. 

6.4.2 Qualitative Definitions of Deterministic Chaos 

Chaos and chaotic motion are intuitive concepts that are not easy to define in a 
quantitative and measurable manner. An example taken from daily life may il- 
lustrate the problem. Imagine a disk-shaped square in front of the main railway 
station of a large city, say somewhere in southern Europe, during rush hour. At the 
edges of the square busses are coming and going, dropping passengers and waiting 
for new passengers who commute with the many trains entering and leaving the 
station. Looking onto the square from the top, the motion of people in the crowd 
will seem to us nearly or completely chaotic. And yet we know that every single 
passenger follows a well-defined path: he gets off the train on platform 17 and 
makes his way through the crowd to a target well known to him, say bus no. 42, 
at the outer edge of the square. 
Now imagine the same square on a holiday, on the day of a popular annual
fair. People are coming from all sides, wandering between the stands, going here 
and there rather erratically and without any special purpose. Again, looking at the 
square from the top, the motion of people in the crowd will seem chaotic, at least 
to our intuitive conception. Clearly, in the second case, motion is more accidental 
and less ordered than in the first. There is more chaoticity in the second situation 
than in the first. The question arises whether this difference can be made quanti- 
tative. Can one indicate measurements that answer quantitatively whether a given 
type of motion is really unordered or whether it has an intrinsic pattern one did 
not recognize immediately?⁵

We give here two provisional definitions of chaos but return to a more quan- 
titative one in Sect. 6.5 below. Both of them, in essence, define a motion to be 
chaotic whenever it cannot be predicted, in any practical sense, from earlier con- 
figurations of the same dynamical system. In other terms, even though the motion 
is strictly deterministic, predicting a state of motion from an initial configuration 
may require knowledge of the latter to a precision that is far beyond any practical 
possibility. 

(i) The first definition makes use of Fourier analysis of a sequence of values
$\{x_1, x_2, \ldots, x_n\}$, which are taken on at the discrete times $t_\tau = \tau \cdot \Delta$, $\tau = 1, 2, \ldots, n$.
Fourier transformation assigns to this sequence another sequence of
complex numbers $\{\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n\}$ by

$$\tilde{x}_\sigma \overset{\text{def}}{=} \frac{1}{\sqrt{n}} \sum_{\tau=1}^{n} x_\tau\, e^{-i 2\pi \sigma\tau/n} , \qquad \sigma = 1, 2, \ldots, n . \tag{6.62}$$



While the former is defined over the time variable, the latter is defined over a frequency
variable, as will be clear from the following. The sequence $\{x_\tau\}$ is recorded
during the total time

$$T = t_n = n\Delta ,$$

or, if we measure time in units of the interval $\Delta$, $T = n$. The sequence $\{x_\tau\}$ may
be understood as a discretized function $x(t)$ such that $x_\tau = x(\tau)$ (with time in
units of $\Delta$). Then $F = 2\pi/n$ is the frequency corresponding to time $T$, and the
sequence $\{\tilde{x}_\sigma\}$ is the discretization of a function $\tilde{x}$ of the frequency variable with
$\tilde{x}_\sigma = \tilde{x}(\sigma \cdot F)$. Thus, time and frequency are conjugate variables.

Although the $\{x_\tau\}$ are real, the $\tilde{x}_\sigma$ of (6.62) are complex numbers. However,
they fulfill the relations $\tilde{x}_{n-\sigma} = \tilde{x}_\sigma^*$ and thus do not contain additional degrees of
freedom. One has the relation

$$\sum_{\tau=1}^{n} x_\tau^2 = \sum_{\sigma=1}^{n} |\tilde{x}_\sigma|^2$$

⁵ In early Greek cosmology chaos meant “the primeval emptiness of the universe” or, alternatively,
“the darkness of the underworld”. The modern meaning is derived from Ovid, who defined chaos
as “the original disordered and formless mass from which the maker of the Cosmos produced
the ordered universe” (The New Encyclopedia Britannica). Note that the loan-word gas is derived
from the word chaos. It was introduced by J.B. von Helmont, a 17th-century chemist in Brussels.
and the inverse transformation reads⁶

$$x_\tau = \frac{1}{\sqrt{n}} \sum_{\sigma=1}^{n} \tilde{x}_\sigma\, e^{i 2\pi \sigma\tau/n} . \tag{6.63}$$



The following correlation function is a good measure of the predictability of 
a signal at a later time, from its present value: 



$$g_\lambda \overset{\text{def}}{=} \frac{1}{n} \sum_{\sigma=1}^{n} x_\sigma x_{\sigma+\lambda} . \tag{6.64}$$

$g_\lambda$ is a function of time, $g_\lambda = g_\lambda(\lambda \cdot \Delta)$. If this function tends to zero, for increasing
time, this means that any correlation to the system’s past gets lost. The system
ceases to be predictable and thus enters a regime of irregular motion.

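One can prove the following properties of the correlation function $g_\lambda$. It has
the same periodicity as $x_\tau$, i.e. $g_{\lambda+n} = g_\lambda$. It is related to the real quantities $|\tilde{x}_\sigma|^2$
by the formula

$$g_\lambda = \frac{1}{n} \sum_{\sigma=1}^{n} |\tilde{x}_\sigma|^2 \cos(2\pi \sigma\lambda/n) , \qquad \lambda = 1, 2, \ldots, n . \tag{6.65}$$

Hence, it is the Fourier transform of $|\tilde{x}_\sigma|^2$. Equation (6.65) can be inverted to give

$$\tilde{g}_\sigma = |\tilde{x}_\sigma|^2 = \sum_{\lambda=1}^{n} g_\lambda \cos(2\pi \sigma\lambda/n) . \tag{6.66}$$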



The graph of $\tilde{g}_\sigma$ as a function of frequency gives direct information on the sequence
$\{x_\tau\}$, i.e. on the signal $x(t)$. For instance, if $\{x_\tau\}$ was generated by a stroboscopic
measurement of a singly periodic motion, then $\tilde{g}_\sigma$ shows a sharp peak at the corresponding
frequency. Similarly, if the signal has a quasiperiodic structure, the graph
of $\tilde{g}_\sigma$ contains a series of sharp frequencies, i.e. peaks of various strengths. Examples
are given, for instance, by Bergé, Pomeau, and Vidal (1987). If, on the other
hand, the signal is totally aperiodic, the graph of $\tilde{g}_\sigma$ will exhibit a practically continuous
spectrum. When inserted in the correlation function (6.65) this means that
$g_\lambda$ will go to zero for large times. In this case the long-term behavior of the system
becomes practically unpredictable. Therefore, the correlation function (6.65), or its
Fourier transform (6.66), provides a criterion for the appearance of chaotic behavior:
if $g_\lambda$ tends to zero, after a finite time, or, equivalently, if $\tilde{g}_\sigma$ has a continuous
domain, one should expect to find irregular, chaotic motion of the system.
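
The relations (6.62) to (6.66) are easy to test on a short record. The sketch below is an illustration, not part of the original text. It evaluates the sums literally, with indices running from 1 to $n$, for a singly periodic test signal. The signal, the length $n = 64$, and all function names are assumptions, and the sequence is continued periodically so that $x_{\sigma+\lambda}$ is defined for $\sigma + \lambda > n$.

```python
import cmath
import math

def fourier(x):
    """Discrete transform (6.62); tau and sigma run from 1 to n."""
    n = len(x)
    return [sum(x[tau - 1] * cmath.exp(-2j * math.pi * sigma * tau / n)
                for tau in range(1, n + 1)) / math.sqrt(n)
            for sigma in range(1, n + 1)]

def correlation(x):
    """Correlation function (6.64) for lambda = 1..n, with the sequence
    continued periodically (assumption: x_{sigma+n} = x_sigma)."""
    n = len(x)
    return [sum(x[s] * x[(s + lam) % n] for s in range(n)) / n
            for lam in range(1, n + 1)]

def correlation_from_spectrum(power):
    """Right-hand side of (6.65), with power[sigma-1] = |x~_sigma|**2."""
    n = len(power)
    return [sum(power[sigma - 1] * math.cos(2.0 * math.pi * sigma * lam / n)
                for sigma in range(1, n + 1)) / n
            for lam in range(1, n + 1)]

n = 64
# Singly periodic test signal: 4 full oscillations over the record (illustrative).
x = [math.cos(2.0 * math.pi * 4 * tau / n) for tau in range(1, n + 1)]

x_tilde = fourier(x)
power = [abs(c) ** 2 for c in x_tilde]          # spectrum g~_sigma of (6.66)
g_direct = correlation(x)                        # from the definition (6.64)
g_spectral = correlation_from_spectrum(power)    # from the spectrum via (6.65)
peaks = [sigma for sigma in range(1, n + 1) if power[sigma - 1] > 1e-9]
```

For this signal the spectrum consists of two sharp lines, at $\sigma = 4$ and at its mirror image $\sigma = n - 4$, and the correlation function does not decay. An aperiodic record would instead spread `power` over many values of $\sigma$.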

⁶ In proving this formula one makes use of the “orthogonality relation”

$$\frac{1}{n} \sum_{\sigma=1}^{n} e^{i 2\pi m\sigma/n} = \delta_{m0} , \qquad m = 0, 1, \ldots, n - 1 .$$

(see also Exercise 6.15)
(ii) The second definition, which is closer to the continuous systems (6.1), starts
from the strange or hyperbolic attractors. A detailed description of this class of
attractors is beyond the scope of this book (see, however, Devaney 1989, Bergé,
Pomeau, Vidal 1987, and Exercise 6.14), and we must restrict our discussion to a
few qualitative remarks. One of the striking properties of strange attractors is that
they can sustain orbits that, on average, move apart exponentially (without escaping
to infinity and, of course, without intersecting). In 1971 Newhouse, Takens, and
Ruelle made the important discovery that flows in three dimensions can exhibit this
kind of attractor.⁷ Very qualitatively this may be grasped from Fig. 6.18, which
shows a flow that strongly contracts in one direction but disperses strongly in the
other direction. This flow has a kind of hyperbolic behavior. On the plane where the
flow lines drive apart, orbits show extreme sensitivity to initial conditions. By folding
this picture and closing it with itself one obtains a strange attractor on which
orbits wind around each other (without intersecting) and move apart exponentially.⁸




Whenever there is extreme sensitivity to initial conditions, the long-term be- 
havior of dynamical systems becomes unpredictable, from a practical viewpoint, so 
that the motion appears to be irregular. Indeed, numerical studies show that there 
is deterministically chaotic behavior on strange attractors. This provides us with 
another plausible definition of chaos: flows of deterministic dynamical systems 
will exhibit chaotic regimes when orbits diverge strongly and, as a consequence, 
practically “forget” their initial configurations. 



⁷ Earlier it was held that chaotic motion would occur only in systems with very many degrees of
freedom, such as gases in macroscopic vessels.

⁸ See R.S. Shaw: “Strange attractors, chaotic behavior and information flow”, Z. Naturforschung
A36, (1981) 80.
6.4.3 An Example: The Logistic Equation⁹

An example of a dynamical system of the type (6.60) is provided by the logistic
equation

$$x_{i+1} = \mu x_i (1 - x_i) \equiv f(\mu, x_i) \tag{6.67}$$

with $x \in [0, 1]$ and $1 < \mu \le 4$. This seemingly simple system exhibits an extremely
rich structure if it is studied as a function of the control parameter $\mu$. Its
structure is typical for systems of this kind and reveals several surprising and uni- 
versal regularities. We illustrate this by means of numerical results for the iteration 
(6.67), as a function of the control parameter in the interval given above. It turns out 
that this model clearly exhibits all the phenomena described so far: bifurcations of 
equilibrium positions, period doubling, regimes of chaotic behavior, and attractors. 

We analyze the model (6.67) as described in Sect. 6.4.1. The derivative of 
$f(\mu, x)$, taken at the intersection $\bar{x} = (\mu - 1)/\mu$ with the straight line $y = x$, is 
$f'(\mu, \bar{x}) = 2 - \mu$. In order to keep $|f'|$ initially smaller than 1, one must take 
$\mu > 1$. On the other hand, the iteration (6.67) should not leave the interval [0, 1]. 
Hence, $\mu$ must be chosen smaller than or equal to 4. 

In the interval $1 < \mu < 3$, $|f'| < 1$. Therefore, the point of intersection 
$\bar{x} = (\mu - 1)/\mu$ is one of stable equilibrium. Any initial value $x_1$ except 0 or 1 
converges to $\bar{x}$ by the iteration. The curve $\bar{x}(\mu)$ is shown in Fig. 6.19, in the domain 
$1 < \mu < 3$. 
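
The convergence to the stable equilibrium can be checked on a computer. The following is a minimal sketch in Python; the function names `logistic` and `iterate`, the value $\mu = 2.5$, and the starting values are illustrative choices, not taken from the text.

```python
def logistic(mu, x):
    """One step of the logistic map (6.67): x -> mu * x * (1 - x)."""
    return mu * x * (1.0 - x)


def iterate(mu, x1, n):
    """Apply the map n times to the starting value x1 and return the result."""
    x = x1
    for _ in range(n):
        x = logistic(mu, x)
    return x


mu = 2.5                              # any value in 1 < mu < 3 (illustrative)
x_bar = (mu - 1.0) / mu               # point of intersection with y = x
slope = mu * (1.0 - 2.0 * x_bar)      # derivative of f at x_bar, equals 2 - mu

for x1 in (0.05, 0.5, 0.95):
    # every starting value other than 0 or 1 should end up at x_bar
    print(x1, iterate(mu, x1, 200), x_bar)
```

Since $|f'(\mu, \bar{x})| = |2 - \mu| < 1$, the distance to $\bar{x}$ shrinks by roughly this factor at each step once the iterate is close to the fixed point.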

At $\mu = \mu_0 = 3$ this point becomes marginally stable. Choosing $x_1 = \bar{x} + \delta$ 
and linearizing (6.67), the image of $x_1$ is found in $x_2 = \bar{x} - \delta$, and vice versa. 
If we think of $x_1, x_2, \ldots$ as points of intersection of an orbit with a transverse 
section, then we have exactly the situation described in Sect. 6.3.6 with one of the 
characteristic multipliers crossing the value $-1$. The orbit oscillates back and forth 
between $x_1 = \bar{x} + \delta$ and $x_2 = \bar{x} - \delta$, i.e. it has acquired twice the period of the 
original orbit, which goes through $\bar{x}$. Clearly, this tells us that 

    $(\mu_0 = 3, \; \bar{x}_0 = \bar{x}(\mu_0))$    (6.68) 

is a bifurcation. In order to determine its nature, we investigate the behavior for 
$\mu > \mu_0$. As we just saw, the point $\bar{x} = (\mu - 1)/\mu$ becomes unstable and there 
is period doubling. This means that stable fixed points no longer fulfill the 
condition $\bar{x} = f(\mu, \bar{x})$ but instead return only after two steps of the iteration, i.e. 
$\bar{x} = f(\mu, f(\mu, \bar{x}))$. Thus, we must study the mapping $f \circ f$, that is the iteration 

    $x_{i+1} = \mu^2 x_i (1 - x_i)[1 - \mu x_i (1 - x_i)]$ ,    (6.69) 

9 This equation takes its name from its use in modeling the evolution of, e.g., an animal population 
over time, as a function of fecundity and of the physical limitations of the surroundings. The 
former would lead to an exponential growth of the population, the latter limits the growth, the 
more strongly the bigger the population. If $A_n$ is the population in the year $n$, the model calculates 
the population the following year by an equation of the form $A_{n+1} = r A_n (1 - A_n)$ where $r$ 
is the growth rate, and $(1 - A_n)$ takes account of the limitations imposed by the environmental 
conditions. See e.g. hypertextbook.com/chaos/42.shtml. 

Fig. 6.19. Numerical results for a large number of iterations of the logistic equation (6.67). The first 
bifurcation occurs at $(\mu_0 = 3, \bar{x}_0 = 2/3)$, the second at $\mu_1 = 1 + \sqrt{6}$, etc. The range of $\mu$ shown 
is $1 < \mu \le 4$ 



and find its fixed points. Indeed, if one sketches the function $g = f \circ f$, one 
realizes immediately that it possesses two stable equilibria. This is seen also in 
Fig. 6.19, in the interval $3 < \mu < 1 + \sqrt{6} \approx 3.449$. Returning to the function 
$f$, this tells us that the iteration (6.67) alternates between the two fixed points of 
$g = f \circ f$. If we interpret the observed pattern as described in Sect. 6.3.6 above, 
we realize that the bifurcation (6.68) is of the “pitchfork” type shown in Fig. 6.14. 

The situation remains stable until we reach the value $\mu_1 = 1 + \sqrt{6}$ of the 
control parameter. At this value two new bifurcation points appear: 

    $\left(\mu_1 = 1 + \sqrt{6}, \; \bar{x}_{1/2} = \tfrac{1}{10}\left(4 + \sqrt{6} \pm (2\sqrt{3} - \sqrt{2})\right)\right)$ .    (6.70) 

At these points the fixed points of $g = f \circ f$ become marginally stable, while for 
$\mu > \mu_1$ they become unstable. Once more the period is doubled and one enters 
the domain where the function 

    $h = g \circ g = f \circ f \circ f \circ f$ 

possesses four stable fixed points. Returning to the original function $f$, this means 
that the iteration visits these four points alternately, in a well-defined sequence. 
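
The successive doublings can be made visible by counting the distinct points that the iteration visits after the transients have died out. The sketch below is only an illustration: the helper name `attractor`, the starting value, the number of discarded steps, and the rounding to six digits are assumptions of this example. The closed formula for the two period-2 points used in the last lines follows from solving $x = g(x)$ with (6.69) and is not quoted in the text.

```python
import math


def attractor(mu, x1=0.3, transient=20000, keep=64, digits=6):
    """Return the sorted set of points visited by (6.67) after the transient.

    Values are rounded to `digits` decimals so that a periodic attractor
    shows up as a small, finite set of points.
    """
    x = x1
    for _ in range(transient):          # let the initial oscillations die out
        x = mu * x * (1.0 - x)
    points = set()
    for _ in range(keep):               # record the asymptotic orbit
        x = mu * x * (1.0 - x)
        points.add(round(x, digits))
    return sorted(points)


for mu in (2.8, 3.2, 3.5):
    print(mu, len(attractor(mu)), attractor(mu))

# Period-2 points for 3 < mu < 1 + sqrt(6), from x = f(f(x)) with x != f(x)
mu = 3.2
root = math.sqrt((mu - 3.0) * (mu + 1.0))
print((mu + 1.0 - root) / (2.0 * mu), (mu + 1.0 + root) / (2.0 * mu))
```

With these parameter values one expects one, two, and four points, respectively, in agreement with the sequence of stable fixed points of $f$, $g = f \circ f$, and $h = g \circ g$.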

This process of period doublings $2T, 4T, 8T, \ldots$ and of pitchfork bifurcations 
continues like a cascade until $\mu$ reaches the limit point 

    $\mu_\infty = 3.56994\ldots$    (6.71) 

This limit point was discovered numerically (Feigenbaum 1979). The same is true 
for the pattern of successive bifurcation values of the control parameter, for which 
the following regularity was found empirically. The sequence of ratios of successive 
intervals between bifurcation values has a limit, 

    $\lim_{i \to \infty} \frac{\mu_i - \mu_{i-1}}{\mu_{i+1} - \mu_i} = \delta$ ,    (6.72) 

with $\delta = 4.669201609\ldots$ (Feigenbaum 1979), which is found to be 
universal for sufficiently smooth families of iterative mappings (6.60). 
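
The regularity (6.72) can be explored numerically. Locating the bifurcation values $\mu_i$ themselves is delicate because convergence is slow near a marginally stable point, so the sketch below uses a common stand-in: the "superstable" parameter values $s_n$ at which $x = 1/2$ belongs to the cycle of period $2^n$. Their interval ratios are known to approach the same constant $\delta$. This substitution, the function names, the Newton iteration, and the starting guess 4.7 used to extrapolate the first root are all assumptions of this example, not the method of the text.

```python
import math


def g_and_dg(mu, n):
    """Return f^(2^n)(mu, 1/2) - 1/2 and its derivative with respect to mu."""
    x, dx = 0.5, 0.0
    for _ in range(2 ** n):
        # update the mu-derivative first, it needs the old value of x
        dx = x * (1.0 - x) + mu * (1.0 - 2.0 * x) * dx
        x = mu * x * (1.0 - x)
    return x - 0.5, dx


def superstable(n, guess, steps=50):
    """Newton iteration for the root of f^(2^n)(mu, 1/2) = 1/2 near `guess`."""
    mu = guess
    for _ in range(steps):
        g, dg = g_and_dg(mu, n)
        step = g / dg
        mu -= step
        if abs(step) < 1e-14:
            break
    return mu


def feigenbaum_estimates(levels=7, delta_guess=4.7):
    """Return superstable values s_0..s_levels and successive ratio estimates."""
    s = [2.0, 1.0 + math.sqrt(5.0)]     # periods 1 and 2, known in closed form
    deltas = []
    d = delta_guess                     # only used to extrapolate the first guess
    for n in range(2, levels + 1):
        guess = s[-1] + (s[-1] - s[-2]) / d
        s.append(superstable(n, guess))
        d = (s[-2] - s[-3]) / (s[-1] - s[-2])
        deltas.append(d)
    return s, deltas


s, deltas = feigenbaum_estimates()
for n, d in enumerate(deltas, start=2):
    print(n, s[n], d)
```

The printed ratios should settle close to the value of $\delta$ quoted above, while the $s_n$ accumulate just below the limit point (6.71).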

For $\mu > \mu_\infty$ the system shows a structurally new behavior, which can be 
followed rather well in Figs. 6.21 to 6.24. The figures show the results of the iteration 
(6.67) obtained on a computer. They show the sequence of iterated values $x_i$ for $i > i_n$ 
with $i_n$ chosen large enough that transients (i.e. initial, nonasymptotic states of 
oscillations) have already died out. The iterations shown in Figs. 6.19-21 pertain to 
the range $1001 < i < 1200$, while in Figs. 6.22-24 that range is $1001 < i < 2000$. 
This choice means the following: initial oscillations have practically died out and 
the sequence of the $x_i$ lies almost entirely on the corresponding attractor. The density 
of points reflects approximately the corresponding invariant measure on the 
respective attractor. Figure 6.22 is a magnified section of Fig. 6.21 (the reader should mark 
in Fig. 6.21 the window shown in Fig. 6.22). Similarly, Fig. 6.23 is a magnified 
section of Fig. 6.22. The number of iterations was chosen such that one may compare 
the average densities on these figures directly with those in Figs. 6.19-21 10 . 

Fig. 6.20. In this figure the domain of pitchfork bifurcations and period doubling up to about $16T$ 
as well as the window of period 3 are clearly visible. Range shown: $2.8 < \mu < 4$ 

Fig. 6.21. Range shown is $3.7 < \mu < 3.8$ in a somewhat expanded representation. The window with 
period 5 is clearly visible 

Fig. 6.24. Here one sees a magnification of the periodic window in the right-hand half of Fig. 6.23. 
The window shown corresponds to $0.47 < x < 0.53$ and $3.7440 < \mu < 3.7442$ 

The figures show very clearly that once $\mu$ exceeds the limit value $\mu_\infty$ (6.71) 
there appear domains of chaotic behavior, which, however, are interrupted 
repeatedly by strips with periodic attractors. In contrast to the domain below $\mu_\infty$, which 
shows only periods of the type $2^n$, these intermediate strips also contain sequences 
of periods 

    $p \cdot 2^n, \; p \cdot 3^n, \; p \cdot 5^n$  with  $p = 3, 5, 6, \ldots$ 

Figures 6.22 and 6.23 show the example of the strip of period 5, in the 
neighborhood of $\mu = 3.74$. A comparison of Figs. 6.23 and 6.20 reveals a particularly 
startling phenomenon: the pattern of the picture in the large is repeated in a 
sectional window in the small. 

A closer analysis of the irregular domains shows that here the sequence of 
iterations $\{x_i\}$ never repeats. In particular, initial values $x_1$ and $x_1'$ always drift apart, 
for large times, no matter how close they were chosen. These two observations hint 
clearly at the chaotic structure of these domains. This is confirmed explicitly, e.g., 
by the study of the iteration mapping (6.67) close to $\mu = 4$. For the sake of 
simplicity we only sketch the case $\mu = 4$. It is not difficult to verify that the mapping 

    $f(\mu = 4, x) = 4x(1 - x)$ 

has the following properties. 

(i) The points $x_1 < x_2$ of the interval $[0, \tfrac12]$ are mapped onto points $x_1' < x_2'$ of 
the interval [0, 1]. In other words, the first interval is expanded by a factor 2, the 
relative ordering of the preimages remains unchanged. Points $x_3 < x_4$ taken from 
$[\tfrac12, 1]$ are mapped onto points $x_3'$ and $x_4'$ of the expanded interval [0, 1]. However, 
the ordering is reversed. Indeed, with $x_3 < x_4$ one finds $x_3' > x_4'$. The observed 
dilatation of the images tells us that the distance $\delta$ of two starting values increases 
exponentially, in the course of the iteration. This in turn tells us that one of the 
criteria for chaos to occur is fulfilled: there is extreme sensitivity to initial conditions. 

(ii) The change of orientation between the mappings of $[0, \tfrac12]$ and $[\tfrac12, 1]$ onto 
the interval [0, 1] tells us that an image $x_{i+1}$, in general, has two distinct 
preimages, $x_i \in [0, \tfrac12]$ and $x_i' \in [\tfrac12, 1]$. (The reader should make a drawing in order to 
convince him or herself.) Thus, if this happens, the mapping ceases to be 
invertible. $x_{i+1}$ has two preimages, each of which has two preimages too, and so on. 
It is not possible to reconstruct the past of the iteration. Thus, we find another 
criterion for a chaotic pattern to occur. 
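
Properties (i) and (ii) can be made concrete with a few lines of Python. The sketch is illustrative only: the names `f4`, `preimages`, and `backward_tree` are hypothetical, and the preimage formula is obtained by solving $4x(1 - x) = y$ for $x$.

```python
import math


def f4(x):
    """The logistic map (6.67) at mu = 4."""
    return 4.0 * x * (1.0 - x)


def preimages(y):
    """Both solutions of 4x(1 - x) = y: one in [0, 1/2], one in [1/2, 1]."""
    r = math.sqrt(1.0 - y)
    return (1.0 - r) / 2.0, (1.0 + r) / 2.0


def backward_tree(y, depth):
    """All points that reach y after exactly `depth` steps of the iteration."""
    level = [y]
    for _ in range(depth):
        level = [p for q in level for p in preimages(q)]
    return level


# (i) ordering is kept on [0, 1/2] and reversed on [1/2, 1]
print(f4(0.1) < f4(0.2), f4(0.8) > f4(0.9))

# (ii) the number of possible pasts doubles with every step backwards
for depth in range(1, 6):
    print(depth, len(backward_tree(0.3, depth)))
```

After $k$ backward steps there are $2^k$ candidate histories for a single image point, which is the loss of invertibility described in (ii).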

One can pursue further the discussion of this dynamical system, which seems 
so simple and yet which possesses fascinating structures. For instance, a classifi- 
cation of the periodic attractors is of interest that consists in studying the sequence 
in which the stable points are visited, in the course of the iteration. Fourier analy- 
sis and, specifically, the behavior of the correlation functions (6.65) and (6.66) in 

10 I thank Peter Beckmann for providing these impressive figures and for his advice regarding the 
presentation of this system. 

the chaotic zones are particularly instructive. The (few) rigorous results as well as 
several conjectures for iterative mappings on the unit interval are found in the book 
by Collet and Eckmann (1990). For a qualitative and well-illustrated presentation 
consult Berge, Pomeau, and Vidal (1987). 



6.5 Quantitative Measures of Deterministic Chaos 

6.5.1 Routes to Chaos 

The transition from a regular pattern of the solution manifold of a dynamical sys- 
tem to regimes of chaotic motion, as a function of control parameters, can happen 
in various ways. One distinguishes the following routes to chaos. 

(i) Frequency doubling. The phenomenon of frequency doubling is 
characteristic for the interval $1 < \mu < \mu_\infty = 3.56994$ of the logistic equation (6.67). Above 
the limit value $\mu_\infty$ the iterations (6.67) change in a qualitative manner. A more 
detailed analysis shows that periodic attractors alternate with domains of genuine 
chaos, the chaotic regimes being characterized by the observation that the iteration 
$x_n \mapsto x_{n+1} = f(\mu, x_n)$ yields an infinite sequence of points that never repeats 
and that depends on the starting value $x_1$. This means, in particular, that sequences 
starting at neighboring points $x_1$ and $x_1'$ eventually move away from each other. 
Our qualitative analysis of the logistic mapping with $\mu$ close to 4 in Sect. 6.4.3 (i) 
and (ii) showed how this happens. The iteration stretches the intervals $[0, \tfrac12]$ and 
$[\tfrac12, 1]$ to larger subintervals of [0, 1] (for $\mu = 4$ this is the full interval). It also 
changes orientation by folding back the values that would otherwise fall outside 
the unit interval. As we saw earlier, this combination of stretching and back-folding 
has the consequence that the mapping becomes irreversible and that neighboring 
starting points, on average, move apart exponentially. Let $x_1$ and $x_1'$ be two 
neighboring starting values for the mapping (6.67). If one follows their evolution on a 
calculator, one finds that after $n$ iterations their distance is given approximately by 

    $|x_n' - x_n| \simeq e^{\lambda n} |x_1' - x_1|$ .    (6.73) 

The factor $\lambda$ in the argument of the exponential is called the Liapunov 
characteristic exponent. Negative $\lambda$ is characteristic for a domain with a periodic attractor: the 
points approach each other independently of their starting values. If $\lambda$ is positive, 
on the other hand, neighboring points move apart exponentially. There is extreme 
sensitivity to initial conditions and one finds a chaotic pattern. Indeed, a numerical 
study of (6.67) gives the results (Berge, Pomeau and Vidal 1987) 

    for $\mu = 2.8$ ,  $\lambda \simeq -0.2$ , 

    for $\mu = 3.8$ ,  $\lambda \simeq +0.4$ .    (6.74) 
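
A rough numerical check of (6.74) is possible with a short program. The sketch below rests on an assumption that is standard but not spelled out in the text: by the chain rule, an infinitesimal distance is multiplied by $|f'(\mu, x_k)|$ at each step, so that $\lambda$ in (6.73) can be estimated as the orbit average of $\ln|f'(\mu, x_k)|$. The function name and the numbers of discarded and averaged iterations are illustrative.

```python
import math


def liapunov_logistic(mu, x1=0.3, transient=1000, n=100000):
    """Estimate the exponent of (6.73) as the mean of ln|f'(mu, x_k)| along an orbit."""
    x = x1
    for _ in range(transient):                 # discard the initial oscillations
        x = mu * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(mu * (1.0 - 2.0 * x)))   # f'(mu, x) = mu (1 - 2x)
        x = mu * x * (1.0 - x)
    return total / n


for mu in (2.8, 3.8):
    print(mu, liapunov_logistic(mu))
```

For $\mu = 2.8$ the orbit sits on the stable fixed point, where $f' = 2 - \mu = -0.8$, so the estimate should approach $\ln 0.8 \approx -0.22$; for $\mu = 3.8$ a positive value is expected.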

(ii) Intermittency. In Sect. 6.3.6 we studied the Poincaré mapping at the 
transition from stability to instability for the case where one of the eigenvalues of 
$D\Pi(s_0)$ crosses the unit circle at $-1$. There are other possibilities for the 
transition from stability to instability. (a) As a function of the control parameter, an 
eigenvalue can leave the interior of the unit circle at $+1$. (b) Two complex 
conjugate eigenvalues $c(\mu)\, e^{\pm i \phi(\mu)}$ leave the unit circle along the directions $\phi$ and $-\phi$. 
All three situations play their role in the transition to chaos. In case (a) one talks 
about intermittency of type I, in case (b) about intermittency of type II, while the 
first case above is also called type-III intermittency. We wish to discuss type I in 
a little more detail. 

Figures 6.19 and 6.20 show clearly that at the value $\mu = \mu_c = 1 + \sqrt{8} \approx 3.83$ 
a new cycle with period 3 is born. Therefore, let us consider the triple iteration 
$h(\mu, x) = f \circ f \circ f$. Figure 6.25 shows that the graph of $h(\mu = \mu_c, x)$ is tangent to 
the straight line $y = x$ in three points $\bar{x}^{(1)}, \bar{x}^{(2)}, \bar{x}^{(3)}$. Thus, at these points, we have 

    $h(\mu_c, \bar{x}^{(i)}) = \bar{x}^{(i)}$ ,   $\frac{d}{dx} h(\mu_c, \bar{x}^{(i)}) = 1$ . 

In a small interval around $\mu_c$ and in a neighborhood of any one of the three fixed 
points, $h$ must have the form 

    $h(\mu, x) \simeq \bar{x}^{(i)} + (x - \bar{x}^{(i)}) + a_i (x - \bar{x}^{(i)})^2 + b_i (\mu - \mu_c)$ . 

Fig. 6.25. Graph of the threefold iterated mapping (6.67) for $\mu = \mu_c = 1 + \sqrt{8}$. The 
function $h = f \circ f \circ f$ is tangent to the straight line in three points 



We study the iterative mapping $x_{n+1} = h(\mu, x_n)$ in this approximate form, i.e. 
we register only every third iterate of the original mapping (6.67). We take 
$z = a_i (x - \bar{x}^{(i)})$ and obtain 

    $z_{n+1} = z_n + z_n^2 + \eta$    (6.75) 

with $\eta = a_i b_i (\mu - \mu_c)$. The expression (6.75) holds in the neighborhood of any of 
the three fixed points of $h(\mu, x)$. For negative $\eta$ (6.75) has two fixed points, at 
$z_- = -\sqrt{-\eta}$ and at $z_+ = \sqrt{-\eta}$, the first of which is stable, while the second is 
unstable. For $\eta = 0$ the two fixed points coincide and become marginally stable. 
For small, but positive $\eta$, a new phenomenon is observed as illustrated in Fig. 6.26. 

Fig. 6.26. The iterative mapping (6.75) with small positive $\eta$ spends a long time in the 
narrow channel between the curve $w = z + z^2 + \eta$ and the straight line $w = z$ 



Iterations with a negative starting value of the variable $z$ move for a long time 
within the narrow channel between the graph of the function $z + z^2 + \eta$ and the 
straight line $w = z$. As long as $z$ is small the behavior is oscillatory and has nearly 
the same regularity as with negative values of $\eta$. This phase of the motion is said 
to be the laminar phase. When $|z|$ increases, the iteration quickly moves on to 
a chaotic or turbulent phase. However, the motion can always return to the first 
domain, i.e. to the narrow channel of almost regular behavior. Practical models 
such as the one by Lorenz (see e.g. Berge, Pomeau, Vidal 1987) that contain this 
transition to chaos indeed show regular oscillatory behavior interrupted by bursts 
of irregular and chaotic behavior. 

For small $|z|$ the iteration remains in the channel around $z = 0$ for some finite 
time. In this case successive iterates lie close to each other so that we can replace 
(6.75) by a differential equation. Replacing $z_{n+1} - z_n$ by $dz/dn$, we obtain 

    $\frac{dz}{dn} = \eta + z^2$ .    (6.76) 

[Note that this is our equation (6.54) with a destabilizing nonlinearity, which 
describes then a subcritical saddle-node bifurcation.] Equation (6.76) is integrated at 
once, 

    $z(n) = \sqrt{\eta}\, \tan\left(\sqrt{\eta}\,(n - n_0)\right)$ . 

$n_0$ is the starting value of the iteration and may be taken to zero, without restriction. 
This explicit solution tells us that the number of iterations needed for leaving the 
channel is of the order of $n \sim \pi/(2\sqrt{\eta})$. Hence, $1/\sqrt{\eta}$ is a measure of the time that 
the system spends in the laminar regime. Finally, one can show that the Liapunov 
exponent is approximately $\lambda \sim \sqrt{\eta}$, for small values of $\eta$. 
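
The scaling of the laminar time with $1/\sqrt{\eta}$ is easy to test by iterating (6.75) directly. In the sketch below the entry and exit points $z = \mp 0.5$ and the function name `channel_passage` are arbitrary illustrative choices. Since the orbit is followed from negative to positive $z$, the argument of the tangent in the solution above sweeps through nearly $\pi$ rather than $\pi/2$, and the count is therefore compared with $\pi/\sqrt{\eta}$.

```python
import math


def channel_passage(eta, z_start=-0.5, z_exit=0.5):
    """Count the iterations of (6.75) needed to get from z_start to beyond z_exit."""
    z, n = z_start, 0
    while z < z_exit:
        z = z + z * z + eta      # z increases at every step because z^2 + eta > 0
        n += 1
    return n


for eta in (1e-2, 1e-3, 1e-4):
    # last column: estimate from the continuous solution of (6.76)
    print(eta, channel_passage(eta), math.pi / math.sqrt(eta))
```

Reducing $\eta$ by a factor of four should roughly double the number of iterations spent in the channel.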

(iii) Quasiperiodic motion with nonlinear perturbation. A third route to chaos 
may be illustrated by Example (ii) of Sect. 6.3.4. We consider quasiperiodic 
motion on the torus $T^2$, choosing the Poincaré section as described in (6.49'), i.e. we 
register the points of intersection of the orbit with the transverse section of the 
torus at $\theta_1 = 0 \pmod{2\pi}$. When understood as an iterative mapping, the second 
equation (6.49') reads 

    $\theta_{n+1} = \left(\theta_n + 2\pi \frac{\omega_2}{\omega_1}\right) \bmod 2\pi$ , 

where we write $\theta$ instead of $\theta_2$, for the sake of simplicity. 

Let us perturb this quasiperiodic motion on the torus by adding a nonlinearity 
as follows: 

    $\theta_{n+1} = \left(\theta_n + 2\pi \frac{\omega_2}{\omega_1} + \kappa \sin\theta_n\right) \bmod 2\pi$ .    (6.77) 

This model, which is due to Arnol’d, contains two parameters: the winding 
number $\beta = \omega_2/\omega_1$, with $0 \le \beta < 1$, and the control parameter $\kappa$, which is taken 
to be positive. For $0 \le \kappa < 1$ the derivative of (6.77), $1 + \kappa \cos\theta_n$, has no zero 
and hence the mapping is invertible. For $\kappa > 1$, however, this is no longer true. 
Therefore, $\kappa = 1$ is a critical point where the behavior of the flow on $T^2$ changes 
in a qualitative manner. Indeed, one finds that the mapping (6.77) exhibits chaotic 
behavior for $\kappa > 1$. This means that in crossing the critical value $\kappa = 1$ from 
regular to irregular motion, the torus is destroyed. As a shorthand let us write (6.77) 
as follows: $\theta_{n+1} = f(\beta, \kappa, \theta_n)$. The winding number is defined by the limit 

    $w(\beta, \kappa) = \lim_{n \to \infty} \frac{f^n(\beta, \kappa, \theta) - \theta}{2\pi n}$ ,    (6.78) 

where the iterates $f^n$ are taken without the reduction mod $2\pi$. 

Obviously, for $\kappa = 0$ it is given by $w(\beta, 0) = \beta = \omega_2/\omega_1$. The chaotic regime 
above $\kappa = 1$ may be studied as follows. For a given value of $\kappa$ we choose 
$\beta = \beta_n(\kappa)$ such that the starting value $\theta_0 = 0$ is mapped to $2\pi p_n$, after $q_n$ steps, 
$q_n$ and $p_n$ being integers, 

    $f^{q_n}(\beta_n, \kappa, 0) = 2\pi p_n$ . 

The winding number is then a rational number, $w(\beta_n(\kappa), \kappa) = p_n/q_n \equiv r_n$, $r_n \in \mathbb{Q}$. 
This sequence of rationals may be chosen such that $r_n$ tends to a given irrational 
number $r$, in the limit $n \to \infty$. An example of a very irrational number is the 
Golden Mean. Let $r_n = F_n/F_{n+1}$, where the $F_n$ are the Fibonacci numbers, 
defined by the recurrence relation $F_{n+1} = F_n + F_{n-1}$ and the initial values $F_0 = 0$, 
$F_1 = 1$. Consider 

    $r_n = \frac{F_n}{F_{n+1}} = \frac{1}{1 + r_{n-1}}$ 

in the limit $n \to \infty$. Hence $r = 1/(1 + r)$. The positive solution of this equation 
is the Golden Mean $r = (\sqrt{5} - 1)/2$ 11 . 
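
The convergence of the ratios $r_n = F_n/F_{n+1}$ can be followed with a few lines of Python; the function name `fibonacci_ratios` and the number of terms are illustrative.

```python
import math


def fibonacci_ratios(count):
    """Return [F_1/F_2, F_2/F_3, ...] with F_0 = 0, F_1 = 1, `count` entries."""
    ratios = []
    f_prev, f_cur = 0, 1                 # F_0, F_1
    for _ in range(count):
        f_prev, f_cur = f_cur, f_prev + f_cur
        ratios.append(f_prev / f_cur)
    return ratios


golden = (math.sqrt(5.0) - 1.0) / 2.0    # positive root of r = 1 / (1 + r)
for n, r in enumerate(fibonacci_ratios(12), start=1):
    print(n, r, r - golden)
```

The differences in the last column alternate in sign, in line with the recursion $r_n = 1/(1 + r_{n-1})$, and shrink rapidly.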

11 The Golden Mean is a well-known concept in the fine arts, in the theory of proportions. For 
example, a column of height $H$ is divided into two segments of heights $h_1$ and $h_2$, with 
$H = h_1 + h_2$, such that the proportion of the shorter segment to the longer is the same as that of 
the longer to the column as a whole, i.e. $h_1/h_2 = h_2/H = h_2/(h_1 + h_2)$. The ratio 
$h_1/h_2 = r = (\sqrt{5} - 1)/2$ is the Golden Mean. This very irrational number has a remarkable continued 
fraction representation: $r = 1/(1 + 1/(1 + \ldots$ 

With this choice, the winding numbers defined above, $w(\beta_n(\kappa), \kappa) = r_n$, will 
converge to $w = r$. A numerical study of this system along the lines described 
above reveals remarkable regularities and scaling properties that are reminiscent of 
the logistic mapping (6.67) (see e.g. Guckenheimer and Holmes (1990), Sect. 6.8.3 
and references quoted there). 
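
As a starting point for such a numerical study, the winding number (6.78) can be estimated by iterating (6.77) without the reduction mod $2\pi$ and dividing the accumulated angle by $2\pi n$. The sketch below is a minimal illustration; the function name, the finite number of iterations, and the parameter values are assumptions of this example.

```python
import math


def winding_number(beta, kappa, theta0=0.0, n=20000):
    """Finite-n estimate of (6.78) for the mapping (6.77), taken without mod 2*pi."""
    theta = theta0
    for _ in range(n):
        theta = theta + 2.0 * math.pi * beta + kappa * math.sin(theta)
    return (theta - theta0) / (2.0 * math.pi * n)


print(winding_number(0.3, 0.0))    # kappa = 0: the winding number equals beta
print(winding_number(0.1, 0.9))    # orbit is trapped by a fixed point of (6.77)
print(winding_number(0.3, 0.9))
```

For $\kappa = 0$ the estimate reproduces $w(\beta, 0) = \beta$. In the second call $2\pi\beta < \kappa$, so the mapping possesses a stable fixed point and the estimate tends to zero even though $\beta \neq 0$.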



6.5.2 Liapunov Characteristic Exponents 



Chaotic behavior is observed whenever neighboring trajectories, on average, 
diverge exponentially on attractors. Clearly, one wishes to have a criterion at hand 
that allows one to measure the speed of this divergence. Thus, we consider a 
solution $\Phi(t, y)$ of the equation of motion (6.1), change its initial condition by the 
amount $\delta y$, and test whether, and if yes at what rate, the solutions $\Phi(t, y)$ and 
$\Phi(t, y + \delta y)$ move apart. In linear approximation their difference obeys (6.11), i.e. 

    $\delta\dot{\Phi} \equiv \dot{\Phi}(t, y + \delta y) - \dot{\Phi}(t, y) = A(t)\,[\Phi(t, y + \delta y) - \Phi(t, y)]$ 
    $\quad \equiv A(t)\, \delta\Phi$ ,    (6.79) 

the matrix $A(t)$ being given by 

    $A(t) = \left. \frac{\partial F}{\partial x} \right|_{x = \Phi(t, y)}$ . 

Unfortunately, (6.79), in general, cannot be integrated analytically and one must 
resort to numerical algorithms, which allow the determination of the distance of 
neighboring trajectories as a function of time. Nevertheless, imagine we had solved 
(6.79). At $t = 0$ we have $\delta\Phi = \Phi(0, y + \delta y) - \Phi(0, y) = \delta y$. For $t > 0$ let 

    $\delta\Phi(t) = U(t) \cdot \delta y$    (6.80) 

be the solution of the differential equation (6.79). From (6.79) one sees that the 
matrix $U(t)$ itself obeys the differential equation 

    $\dot{U}(t) = A(t)\, U(t)$ 

and therefore may be written formally as follows: 

    $U(t) = \exp\left\{ \int_0^t dt'\, A(t') \right\} U(0)$ ,  with  $U(0) = 1$ .    (6.81) 



Although this is generally not true, imagine the matrix $A$ to be independent of time. 
Let $\{\lambda_k\}$ denote its eigenvalues (which may be complex numbers) and use the basis 
system of the corresponding eigenvectors. Then $U(t) = \{\exp(\lambda_k t)\}$ is also diagonal. 
Whether or not neighboring trajectories diverge exponentially depends on whether 
or not the real part of one of the eigenvalues, $\mathrm{Re}\,\lambda_k = \tfrac12(\lambda_k + \lambda_k^*)$, is positive. This 
can be tested by taking the logarithm of the trace of the product $U^\dagger U$, 

    $\frac{1}{2t} \ln \mathrm{Tr}\left(U^\dagger(t)\, U(t)\right) = \frac{1}{2t} \ln \mathrm{Tr}\left(\exp\{(\lambda_k + \lambda_k^*)\, t\}\right)$ , 

and by letting $t$ go to infinity. In this limit only the eigenvalue with the largest 
positive real part survives. With this argument in mind one defines 

    $\mu_1 := \lim_{t \to \infty} \frac{1}{2t} \ln \mathrm{Tr}\left(U^\dagger(t)\, U(t)\right)$ .    (6.82) 

The real number $\mu_1$ is called the leading Liapunov characteristic exponent. It 
provides a quantitative criterion for the nature of the flow: whenever the leading 
Liapunov exponent is positive, there is (at least) one direction along which 
neighboring trajectories move apart, on average, at the rate $\exp(\mu_1 t)$. There is extreme 
sensitivity to initial conditions: the system exhibits chaotic behavior. 
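
To see the definition (6.82) at work, one may integrate $\dot{U} = A(t)\,U$ numerically and evaluate the expression at a large but finite time. The following sketch uses a classical Runge-Kutta step written in plain Python; the helper names, the step count, and the constant test matrix with eigenvalues $-1$ and $0.5$ are assumptions made for illustration only.

```python
import math


def matmul(P, Q):
    """Product of two matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def add_scaled(P, Q, c):
    """Return the matrix P + c * Q."""
    return [[p + c * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def propagate(A, t_end, steps):
    """Integrate dU/dt = A(t) U with U(0) = 1 by the classical Runge-Kutta method."""
    n = len(A(0.0))
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = t_end / steps
    for s in range(steps):
        t = s * h
        k1 = matmul(A(t), U)
        k2 = matmul(A(t + h / 2), add_scaled(U, k1, h / 2))
        k3 = matmul(A(t + h / 2), add_scaled(U, k2, h / 2))
        k4 = matmul(A(t + h), add_scaled(U, k3, h))
        for K, c in ((k1, h / 6), (k2, h / 3), (k3, h / 3), (k4, h / 6)):
            U = add_scaled(U, K, c)
    return U


def leading_exponent(A, t_end, steps=4000):
    """Evaluate (1/2t) ln Tr(U^T U) at t = t_end for a real matrix A(t)."""
    U = propagate(A, t_end, steps)
    trace = sum(u * u for row in U for u in row)     # Tr(U^T U) for real U
    return math.log(trace) / (2.0 * t_end)


def A_const(t):
    """Illustrative constant matrix with eigenvalues -1 and 0.5."""
    return [[-1.0, 2.0], [0.0, 0.5]]


print(leading_exponent(A_const, 40.0))
```

At finite $t$ the result differs from the largest real part, here 0.5, by a correction that decreases like $1/t$.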

The definition (6.82) applies also to the general case where $A(t)$ depends on 
time. Although the eigenvalues and eigenvectors of $A(t)$ now depend on time, 
(6.82) has a well-defined meaning. Note, however, that the leading exponent 
depends on the reference solution $\Phi(t, y)$. 

The definition (6.82) yields only the leading Liapunov exponent. If one wishes 
to determine the next to leading exponent $\mu_2 \le \mu_1$, one must take out the direction 
pertaining to $\mu_1$ and repeat the same analysis as above. Continuing this procedure 
yields all Liapunov characteristic exponents, ordered according to magnitude, 

    $\mu_1 \ge \mu_2 \ge \ldots \ge \mu_f$ .    (6.83) 

The dynamical system exhibits chaotic behavior if and only if the leading Liapunov 
exponent is positive. 

For discrete systems in $f$ dimensions, $x_{n+1} = F(x_n)$, $x \in \mathbb{R}^f$, the Liapunov 
exponents are obtained in an analogous fashion. Let $v^{(1)}$ be those vectors in the 
tangent space at an arbitrary point $x$ that grow at the fastest rate under the action of the 
linearization of the mapping $F$, i.e. those for which $|(DF(x))^n v^{(1)}|$ is largest. Then 

    $\mu_1 = \lim_{n \to \infty} \frac{1}{n} \ln \left| (DF(x))^n v^{(1)} \right|$ .    (6.84) 

In the next step, one determines the vectors $v^{(2)}$ that grow at the second fastest 
rate, leaving out the subspace of the vectors $v^{(1)}$. The same limit as in (6.84) yields 
the second exponent $\mu_2$ etc. We consider two simple examples that illustrate this 
procedure. 

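(i) Take $F$ to be two-dimensional and let $x^0$ be a fixed point, $x^0 = F(x^0)$. 
$DF(x^0)$ is diagonalizable, 

    $DF(x^0) = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$ 

with, say, $\lambda_1 > \lambda_2$. Choose $v^{(1)}$ from $\{\mathbb{R}^2 \setminus 2\text{-axis}\}$, i.e. in such a way that its 
1-component does not vanish, and choose $v^{(2)}$ along the 2-axis, 

    $v^{(1)} = \begin{pmatrix} a^{(1)} \\ b^{(1)} \end{pmatrix}$ ,  $a^{(1)} \neq 0$ ;   $v^{(2)} = \begin{pmatrix} a^{(2)} \\ b^{(2)} \end{pmatrix}$ ,  $a^{(2)} = 0$ . 

Then one obtains 

    $\mu_1 = \lim_{n \to \infty} \frac{1}{n} \ln \left| \begin{pmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{pmatrix} \begin{pmatrix} a^{(1)} \\ b^{(1)} \end{pmatrix} \right| = \lim_{n \to \infty} \frac{1}{n} \ln \left| a^{(1)} \lambda_1^n \right| = \ln |\lambda_1|$ , 

likewise $\mu_2 = \ln |\lambda_2|$, and $\mu_1 > \mu_2$. 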

(ii) Consider a mapping in two dimensions, $x = \binom{u}{v}$, that is defined on the 
unit square $0 \le u \le 1$, $0 \le v \le 1$, by the equations 

    $u_{n+1} = 2 u_n \pmod 1$ ,    (6.85a) 

    $v_{n+1} = \begin{cases} a v_n & \text{for } 0 \le u_n < \tfrac12 \\ a v_n + \tfrac12 & \text{for } \tfrac12 \le u_n \le 1 \end{cases}$    (6.85b) 

with $a < 1$. Thus, in the direction of $v$ this mapping is a contraction for $u < \tfrac12$ 
and a contraction and a shift for $u \ge \tfrac12$. In the direction of $u$ its effect is stretching 
and back-bending whenever the unit interval is exceeded. (It is called the baker’s 
transformation because of the obvious analogy to kneading, stretching, and 
back-folding of dough.) This dissipative system is strongly chaotic. This will become 
clear empirically if the reader works out the example $a = 0.4$ on a PC, by following 
the fate of the points on the circle with origin $(\tfrac12, \tfrac12)$ and radius $a$, under the action 
of successive iterations. The original volume enclosed by the circle is contracted. 
At the same time horizontal distances (i.e. parallel to the 1-axis) are stretched 
exponentially because of (6.85a). The system possesses a strange attractor, which is 
stretched and folded back onto itself and which consists of an infinity of horizontal 
lines. Its basin of attraction is the whole unit square. Calculation of the Liapunov 
characteristic exponents by means of the formula (6.84) gives the result 

    $\mu_1 = \ln 2$ ,   $\mu_2 = \ln |a|$ , 

and thus $\mu_2 < 0 < \mu_1$. 
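
The example suggested above may be sketched as follows. The code is an illustration under stated assumptions: the function names are hypothetical, the circle is sampled at 360 points, and only a few iterations are applied because repeated doubling modulo 1 in binary floating-point arithmetic loses one bit per step and collapses to zero after about fifty steps. The exponents are evaluated from (6.84) using the fact that the linearization of (6.85) is the same diagonal matrix with entries 2 and $a$ at every point away from the discontinuity.

```python
import math


def baker(u, v, a):
    """One step of the mapping (6.85a), (6.85b)."""
    v_new = a * v if u < 0.5 else a * v + 0.5
    u_new = (2.0 * u) % 1.0
    return u_new, v_new


def exponents(a, n=50):
    """Evaluate (1/n) ln |DF^n v| for DF = diag(2, a) and two tangent vectors."""
    def grow(vec):
        x, y = vec
        for _ in range(n):
            x, y = 2.0 * x, a * y          # action of the linearized mapping
        return math.log(math.hypot(x, y)) / n
    # v1 has a nonvanishing 1-component, v2 lies along the 2-axis
    return grow((1.0, 0.3)), grow((0.0, 1.0))


a = 0.4
circle = [(0.5 + a * math.cos(2 * math.pi * k / 360),
           0.5 + a * math.sin(2 * math.pi * k / 360)) for k in range(360)]
points = circle
for _ in range(4):                          # a few steps only, see the lead-in
    points = [baker(u, v, a) for (u, v) in points]

# the images concentrate on thin horizontal strips
print(sorted({round(v, 2) for _, v in points}))
print(exponents(a), (math.log(2.0), math.log(a)))
```

After a few steps the values of $v$ cluster in narrow bands, the beginning of the layered structure of the attractor, while the two growth rates reproduce $\ln 2$ and $\ln|a|$.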



6.5.3 Strange Attractors 

Example (ii) of the preceding section shows that the system (6.85a) lands on a 
strangely diffuse object which is neither an arc of a curve nor a piece of a surface 
in the unit square, but somehow “something in between”. This strange attractor 
does indeed have zero volume, but its geometric dimension is not an integer. Geo- 
metric structures of this kind are said to be fractals. Although a rigorous discussion 
of this concept and a detailed analysis of fractal-like strange attractors is beyond 
the scope of this book, we wish at least to give an idea of what such objects are. 

Imagine a geometric object of dimension $d$ embedded in a space $\mathbb{R}^n$, where 
$d$ need not necessarily be an integer. Scaling all its linear dimensions by a factor 
$\lambda$, the object’s volume will change by a factor $\kappa = \lambda^d$, i.e. 

    $d = \frac{\ln \kappa}{\ln \lambda}$ . 

Clearly, for points, arcs of curves, surfaces, and volumes in $\mathbb{R}^3$ one finds in this 
way the familiar dimensions $d = 0$, $d = 1$, $d = 2$ and $d = 3$, respectively. A 
somewhat more precise formulation is the following. A set of points in $\mathbb{R}^n$, which 
is assumed to lie in a finite volume, is covered by means of a set of elementary 
cells $B$ whose diameter is $\varepsilon$. These cells may be taken to be little cubes of side 
length $\varepsilon$, or little balls of diameter $\varepsilon$, or the like. If $N(\varepsilon)$ is the minimal 
number of cells needed to cover the set of points completely, the so-called Hausdorff 
dimension of the set is defined to be 

    $d_H = \lim_{\varepsilon \to 0} \frac{\ln N(\varepsilon)}{\ln(1/\varepsilon)}$ ,    (6.86) 

provided this limit exists. To cover a single point, one cell is enough, $N(\varepsilon) = 1$; to 
cover an arc of length $L$ one needs at least $N(\varepsilon) = L/\varepsilon$ cells; more generally, to 
cover a $p$-dimensional smooth hypersurface $F$, $N(\varepsilon) = F/\varepsilon^p$ cells will be enough. 
In these cases, the definition (6.86) yields the familiar Euclidean dimensions $d_H = 0$ 
for a point, $d_H = 1$ for an arc, and $d_H = p$ for the hypersurface $F$ with $p < n$. 

For fractals, on the other hand, the Hausdorff dimension is found to be 
noninteger. A simple example is provided by the Cantor set of the middle third, which is 
defined as follows. From a line segment of length 1 one cuts out the middle third. 
From the remaining two segments $[0, \tfrac13]$ and $[\tfrac23, 1]$ one again cuts out the 
middle third, etc. By continuing this process an infinite number of times one obtains 
the middle third Cantor set. Taking $\varepsilon_1 = \tfrac13$, the minimum number of intervals of 
length $\varepsilon_1$ needed to cover the set is $N(\varepsilon_1) = 2$. If we take $\varepsilon_2 = \tfrac19$ instead, we 
need at least $N(\varepsilon_2) = 4$ intervals of length $\varepsilon_2$, etc. For $\varepsilon_n = 1/3^n$ the minimal 
number is $N(\varepsilon_n) = 2^n$. Therefore, 

    $d_H = \lim_{n \to \infty} \frac{\ln 2^n}{\ln 3^n} = \frac{\ln 2}{\ln 3} \approx 0.631$ . 
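
The counting argument can be reproduced on a computer. The sketch below represents the intervals that remain after $m$ cuts by their left endpoints, written as integers in units of $3^{-m}$ so that no rounding occurs, and counts the boxes of length $3^{-n}$ that contain them. The function names and the values of $m$ and $n$ are illustrative, and the boxes are taken on a fixed grid, which for this set gives the same count as the minimal cover.

```python
import math
from itertools import product


def cantor_left_endpoints(m):
    """Left endpoints of the 2**m intervals left after m cuts, in units of 3**-m.

    These are exactly the integers whose m ternary digits are all 0 or 2.
    """
    return [sum(d * 3 ** (m - 1 - i) for i, d in enumerate(digits))
            for digits in product((0, 2), repeat=m)]


def box_count(m, n):
    """Number of grid boxes of length 3**-n occupied by the level-m set (n <= m)."""
    scale = 3 ** (m - n)
    return len({e // scale for e in cantor_left_endpoints(m)})


def dimension_estimate(m, n):
    """ln N(eps_n) / ln(1 / eps_n) with eps_n = 3**-n, cf. (6.86)."""
    return math.log(box_count(m, n)) / math.log(3 ** n)


for n in range(1, 9):
    print(n, box_count(8, n), dimension_estimate(8, n))
```

Every line should show $N(\varepsilon_n) = 2^n$ and the same ratio $\ln 2/\ln 3$.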



Another simple and yet interesting example is provided by the so-called 
snowflake set, which is obtained by the following prescription. One starts from an 
equilateral triangle in the plane. To the middle third of each of its sides one adds another 
equilateral triangle, of one third the dimension of the original one, and keeps only 
the outer boundary. One repeats this procedure infinitely many times. The object 
generated in this way has infinite circumference. Indeed, take the side length of 
the initial triangle to be 1. At the $n$th step of the construction described above 
the side length of the last added triangles is $\varepsilon_n = 1/3^n$. Adding a triangle to the 
side of length $\varepsilon_{n-1}$ breaks it up into four segments of length $\varepsilon_n$ each. Therefore, 
the circumference is $C_n = 3 \times 4^n \times \varepsilon_n = 4^n/3^{n-1}$. Clearly, in the limit $n \to \infty$ 
this diverges, even though the whole object is contained in a finite portion of the 
plane. On the other hand, if one calculates the Hausdorff dimension in the same 
way as for the middle third Cantor set one finds $d_H = \ln 4/\ln 3 \approx 1.262$. 
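
Both statements, the diverging circumference and the finite dimension, follow from the segment count and can be tabulated exactly. The sketch below is an illustration; the function names are hypothetical, and the cover is taken to consist of one cell of length $\varepsilon_n$ per boundary segment, which is an assumption of this estimate.

```python
import math
from fractions import Fraction


def snowflake_circumference(n):
    """C_n = 3 * 4**n * eps_n with eps_n = 3**-n, as an exact fraction."""
    return 3 * 4 ** n * Fraction(1, 3 ** n)


def snowflake_dimension_estimate(n):
    """ln N(eps_n) / ln(1 / eps_n), taking N(eps_n) = 3 * 4**n cells."""
    return math.log(3 * 4 ** n) / math.log(3 ** n)


for n in (1, 2, 5, 10, 100, 1000):
    print(n, float(snowflake_circumference(n)), snowflake_dimension_estimate(n))
```

The circumference grows like $(4/3)^n$, while the dimension estimate approaches $\ln 4/\ln 3$ slowly, with a correction of order $1/n$ that stems from the factor 3 in the segment count.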

There are further questions regarding chaotic regimes of dynamical systems, 
such as: If strange attractors have the structure of fractals, can one measure their 
generalized dimension? Is it possible to describe deterministically chaotic motion 
on the attractor quantitatively by means of a test quantity (some kind of entropy), 
which would tell us whether the chaos is rich or poor? These questions lead us 
beyond the range of the tools developed in this book. In fact, they are the subject of 
present-day research and no final answers have been given so far. We refer to the 
literature quoted in the Appendix for an account of the present state of knowledge. 



6.6 Chaotic Motions in Celestial Mechanics 

We conclude this chapter on deterministic chaos with a brief account of some fas- 
cinating results of recent research on celestial mechanics. These results illustrate in 
an impressive way the role of deterministically chaotic motion in our planetary sys- 
tem. According to the traditional view, the planets of the solar system move along 
their orbits with the regularity of a clockwork. To a very good approximation, the 
motion of the planets is strictly periodic, i.e. after one turn each planet returns to 
the same position, and the planetary orbits are practically fixed in space relative to the 
fixed stars. From our terrestrial point of view no motion seems more stable, more 
uniform over very long time periods than the motion of the stars in the sky. It is pre- 
cisely the regularity of planetary motion that, after a long historical development, 
led to the discovery of Kepler’s laws and, eventually, to Newton’s mechanics. 

On the other hand, our solar system with its planets, their satellites, and the very 
many smaller objects orbiting around the sun is a highly complex dynamical sys- 
tem whose stability has not been established in a conclusive manner. Therefore, it 
is perhaps not surprising that there are domains of deterministically chaotic motion 
even in the solar system with observable consequences. It seems, for instance, that 
chaotic motion is the main reason for the formation of the Kirkwood gaps (these 
are gaps in the asteroid belts between Mars and Jupiter which appear at some ratio- 
nal ratios of the periods of revolution of the asteroid and Jupiter) and that chaotic 
motion also provides an important source for the transport of meteorites to the 
earth (Wisdom 1987). 

In this section we describe an example of chaotic tumbling of planetary satel- 
lites which is simple enough that the reader may reproduce some of the figures on 
a PC. We then describe some recent results regarding the topics mentioned above. 

6.6.1 Rotational Dynamics of Planetary Satellites 

The moon shows us always the same face. This means that the period of its spin 
(its intrinsic angular momentum) is equal to the period of its orbital motion and 
that its axis of rotation is perpendicular to the plane of the orbit. In fact, this is its 
final stage, which was reached after a long-term evolution comprising two phases: 
a dissipative phase, or slowing-down phase, and the final, Hamiltonian phase that 
we observe today. Indeed, although we do not know the details of the moon’s formation, 
it probably had a much faster initial rate of rotation and its axis of rotation was 
not perpendicular to the plane of the orbit. Through the action of friction by tidal 
forces, the rotation was slowed down, over a time period of the order of the age of 
the planetary system, until the period of rotation became equal to the orbital pe- 
riod. At the same time the axis of rotation turned upright such as to point along the 
normal to the orbit plane. These results can be understood on the basis of simple 
arguments regarding the action of tidal forces on a deformable body and simple 
energy considerations. In the synchronous phase of rotation (i.e. spin period equal 
to orbital period) the effect of tidal forces is minimal. Furthermore, for a given 
frequency of rotation, the energy is smallest if the rotation takes place about the 
principal axis with the largest moment of inertia. Once the satellite has reached 
this stage, the motion is Hamiltonian, to a very good approximation. 

Thus, any satellite close enough to its mother planet that tidal forces can mod- 
ify its motion in the way described above and within a time period comparable to 
the age of the solar system will enter this synchronous phase, which is stable in 
the case of our moon. This is not true, as we shall see below, if the satellite has 
a strongly asymmetric shape and if it moves on an ellipse of high eccentricity. 

The Voyager 1 and 2 space missions took pictures of Hyperion, one of the far- 
thest satellites of Saturn, on passing close to Saturn in November 1980 and August 
1981. Hyperion is an asymmetric top whose linear dimensions were determined 
to be 

190 km × 145 km × 114 km 

with an uncertainty of about ±15 km. The eccentricity of its elliptical orbit is 
$\varepsilon = 0.1$; its orbital period is 21 days. The surprising prediction is that Hyperion 
performs a chaotic tumbling motion in the sense that its angular velocity and the 
orientation of its axis of rotation are subject to strong and erratic changes within 
a few periods of revolution. This chaotic dance, which, at some stage, must have 
also occurred in the history of other satellites (such as Phobos and Deimos, the 
companions of Mars), is a consequence of the asymmetry of Hyperion and of the 
eccentricity of its orbit. This is what we wish to show within the framework of a 
simplified model. 

The model is shown in Fig. 6.27. Hyperion H moves around Saturn S on an 
ellipse with semimajor axis a and eccentricity $\varepsilon$. We simulate its asymmetric shape 
by means of four mass points 1 to 4 that have the same mass m and are arranged 
in the orbital plane as shown in the figure. The line 2-1 (the distance between 2 
and 1 is d) is taken to be the 1-axis, the line 4-3 (distance e < d) is taken to be 
the 2-axis. The moments of inertia are then given by 

$$I_1 = \tfrac{1}{2} m e^2 < I_2 = \tfrac{1}{2} m d^2 < I_3 = \tfrac{1}{2} m (d^2 + e^2) . \tag{6.87}$$

As we said above, the satellite rotates about the 3-axis, i.e. the axis with the largest 
moment of inertia. This axis is perpendicular to the orbit plane (in Fig. 6.27 it points 
towards the reader). It is reasonable to assume that Hyperion’s motion has no 
appreciable effect on Saturn, its mother planet, whose motion is very slow compared 
to that of Hyperion. 

The gravitational field at the position of Hyperion is not homogeneous. As $I_1$ 
and $I_2$ are not equal, the satellite is subject to a net torque that depends on its 
position in the orbit. We calculate the torque for the pair (1,2). The result for the 
pair (3,4) will then follow immediately. We have 

$$\mathbf{D}^{(1,2)} = \frac{\mathbf{d}}{2} \times (\mathbf{F}_1 - \mathbf{F}_2) ,$$

where $\mathbf{F}_i = -G m M \mathbf{r}_i / r_i^3$ is the force acting on the mass point i, M being the 
mass of Saturn. The distance $d = |\mathbf{d}|$ being small compared to the radial distance 
r from Saturn we have, with the notations as in Fig. 6.27, 



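$$\frac{1}{r_i^3} = \frac{1}{r^3} \left( 1 \pm \frac{d}{r} \cos\alpha + \frac{d^2}{4 r^2} \right)^{-3/2} \simeq \frac{1}{r^3} \left( 1 \mp \frac{3d}{2r} \cos\alpha \right) .$$

(The upper sign holds for $r_1$, the lower sign for $r_2$.) Inserting this approximation 
as well as the cross product $\mathbf{r} \times \mathbf{d} = -r d \sin\alpha \, \hat{\mathbf{e}}_3$, one finds 

$$\mathbf{D}^{(1,2)} \simeq \frac{3 d^2 m M G}{4 r^3} \sin 2\alpha \, \hat{\mathbf{e}}_3 = \frac{3 G M I_2}{2 r^3} \sin 2\alpha \, \hat{\mathbf{e}}_3 .$$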



In the second step we inserted the expression (6.87) for $I_2$. The product GM can 
be expressed by the semimajor axis a and the orbital period T, using Kepler’s third 
law (1.23). The mass of Hyperion (which in the model is 4m) is small compared 
to M, and therefore it is practically equal to the reduced mass. So, from (1.23) 

$$G M = \left( \frac{2\pi}{T} \right)^2 a^3 .$$

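The calculation is the same for the pair (3,4). Hence, the total torque 
$\mathbf{D}^{(1,2)} + \mathbf{D}^{(3,4)}$ is found to be 

$$\mathbf{D} \simeq \frac{3}{2} \left( \frac{2\pi}{T} \right)^2 (I_2 - I_1) \left( \frac{a}{r} \right)^3 \sin 2\alpha \, \hat{\mathbf{e}}_3 . \tag{6.88}$$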
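The expansion in d/r that leads to (6.88) can be checked numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, the units G = M = m = 1 are arbitrary, and (6.88) is used in the equivalent form $(3GM/2r^3)(I_2 - I_1)\sin 2\alpha$, with $\alpha$ taken as the polar angle of Hyperion minus the orientation angle of its 1-axis. The code sums the exact torques on the four mass points and compares the result with the approximate formula.

```python
import math

def exact_torque(r, d, e, phi, theta, G=1.0, M=1.0, m=1.0):
    """3-component of the exact torque about the satellite's centre.

    r, phi : polar coordinates of the satellite's centre as seen from Saturn
    theta  : orientation of the 1-axis (line 2-1), measured like phi
    d, e   : distances between points 1, 2 and between points 3, 4
    """
    cx, cy = r * math.cos(phi), r * math.sin(phi)
    c, s = math.cos(theta), math.sin(theta)
    # Offsets of the two points on the 1-axis and of the two points on the 2-axis.
    offsets = [(0.5 * d * c, 0.5 * d * s), (-0.5 * d * c, -0.5 * d * s),
               (-0.5 * e * s, 0.5 * e * c), (0.5 * e * s, -0.5 * e * c)]
    torque = 0.0
    for ox, oy in offsets:
        px, py = cx + ox, cy + oy                  # position relative to Saturn
        dist3 = math.hypot(px, py) ** 3
        fx, fy = -G * m * M * px / dist3, -G * m * M * py / dist3
        torque += ox * fy - oy * fx                # (offset x force)_3
    return torque

def approx_torque(r, d, e, phi, theta, G=1.0, M=1.0, m=1.0):
    """Leading-order result (3GM / 2r^3)(I_2 - I_1) sin(2 alpha)."""
    i1, i2 = 0.5 * m * e**2, 0.5 * m * d**2        # moments of inertia (6.87)
    alpha = phi - theta
    return 1.5 * G * M * (i2 - i1) / r**3 * math.sin(2.0 * alpha)

if __name__ == "__main__":
    args = (1.0, 2e-3, 1e-3, 0.7, 0.2)             # illustrative values, d/r << 1
    print(exact_torque(*args), approx_torque(*args))
```

For d and e much smaller than r the two numbers should agree up to relative corrections of order $(d/r)^2$, and both vanish when d = e, i.e. when $I_1 = I_2$.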




Fig. 6.27. A simple model for the asymmetric satellite Hyperion of the planet Saturn 

This result remains valid if the satellite is described by a more realistic distribution 
of mass. It shows that the resulting torque vanishes if $I_1 = I_2$. With this result the 
equation of motion (3.52) for the rotational motion of the satellite reads 

$$I_3 \ddot\theta = \frac{3}{2} \left( \frac{2\pi}{T} \right)^2 (I_2 - I_1) \left( \frac{a}{r(t)} \right)^3 \sin 2\alpha . \tag{6.89}$$

Here, the angle $\theta$ describes the orientation of the satellite’s 1-axis relative to the 
line SP (joining Saturn and Hyperion’s perisaturnion) and $\Phi$ is the usual polar 
angle of Keplerian motion. As $\alpha = \Phi - \theta$, (6.89) reads 

$$I_3 \ddot\theta = -\frac{3}{2} \left( \frac{2\pi}{T} \right)^2 (I_2 - I_1) \left( \frac{a}{r(t)} \right)^3 \sin 2[\theta - \Phi(t)] . \tag{6.89'}$$



This equation contains only one explicit degree of freedom, $\theta$, but its right-hand 
side depends on time because the orbital radius r and the polar angle $\Phi$ are 
functions of time. Therefore, in general, the system is not integrable. There is an 
exception, however. If the orbit is a circle, $\varepsilon = 0$ (cf. Sect. 1.7.2 (ii)), the average 
circular frequency 

$$n \overset{\mathrm{def}}{=} \frac{2\pi}{T} \tag{6.90}$$

is the true angular velocity, i.e. we have $\Phi = nt$, and with $\theta' = \theta - nt$ the equation 
of motion becomes 

$$I_3 \ddot\theta' = -\tfrac{3}{2} n^2 (I_2 - I_1) \sin 2\theta' , \qquad \varepsilon = 0 . \tag{6.91}$$

If we set 

$$z_1 \overset{\mathrm{def}}{=} 2\theta' , \qquad \omega^2 \overset{\mathrm{def}}{=} 3 n^2 \frac{I_2 - I_1}{I_3} , \qquad \tau \overset{\mathrm{def}}{=} \omega t ,$$

(6.91) is recognized to be the equation of motion (1.40) of the plane pendulum, 
viz. $d^2 z_1 / d\tau^2 = -\sin z_1$, which can be integrated analytically. The energy is an 
integral of the motion; it reads 

$$E = \tfrac{1}{2} I_3 \dot\theta'^2 - \tfrac{3}{4} n^2 (I_2 - I_1) \cos 2\theta' . \tag{6.92}$$

If $\varepsilon \neq 0$, the time dependence on the right-hand side of (6.89') cannot be 
eliminated. Although the system has only one explicit degree of freedom, it is 
intrinsically three-dimensional. The early work of Hénon and Heiles (1964) on the 
motion of a star in a cylindrical galaxy showed that Hamiltonian systems may exhibit 
chaotic behavior. For some initial conditions they may have regular solutions, but 
for others the structure of their flow may be chaotic. A numerical study of the 
seemingly simple system (6.89'), which is Hamiltonian, shows that it has solutions 
pertaining to chaotic domains (Wisdom 1987, and original references quoted 
there). One integrates the equation of motion (6.89') numerically and studies the 
result on a transverse section (cf. the Poincaré mapping introduced in Sect. 6.3.4), 
which is chosen as follows. At every passage of the satellite at the point P of 
closest approach to the mother planet one records the momentary orientation of the 
satellite’s 1-axis with respect to the line SP of Fig. 6.27. One then plots the relative 
change $(d\theta/dt)/n$ of the orientation at successive passages through P, for various 
initial conditions. One obtains figures of the kind shown in Figs. 6.28-30. We start 
by commenting on Fig. 6.28. One-dimensional manifolds, i.e. curves, correspond 
to quasiperiodic motion. If, on the other hand, the “measured” points fill a surface, 
this is a hint that there is chaotic motion. The scattered points in the middle part 
of the figure all pertain to the same, chaotic orbit. Also, the two orbits forming an 
“X” at about $(\pi/2, 2.3)$ are chaotic, while the islands in the chaotic zones correspond 
to states of motion where the ratio of the spin period and the period of the orbit is 
rational. For example, the island at (0, 0.5) is the remnant of the synchronous 
motion where Hyperion, on average, would always show the same face to the mother 
planet. The synchronous orbit at $\theta = \pi$ would be the one where the satellite shows 
the opposite face. The curves at the bottom of the figure and in the neighborhood of 
$\theta = \pi/2$ are quasiperiodic motions with an irrational ratio of periods. (It is not 
difficult to see that the range $\pi \le \theta \le 2\pi$ is equivalent to the one shown in the figure.) 

Fig. 6.28. Chaotic behavior of Hyperion, a satellite of Saturn. The picture shows the 
relative change of orientation of the satellite as a function of its orientation, at every 
passage in P, the point of closest approach to Saturn (from Wisdom 1987) 

Fig. 6.29. Analogous result to the one shown in Fig. 6.28, for Deimos, a satellite of 
Mars whose asymmetry (6.93) is $\alpha = 0.81$ and whose orbital eccentricity is 
$\varepsilon = 0.0005$ (from Wisdom 1987) 

Fig. 6.30. Analogous result to those in Figs. 6.28 and 6.29 for Phobos, a satellite of 
Mars whose asymmetry (6.93) is $\alpha = 0.83$ and whose orbital eccentricity is 
$\varepsilon = 0.015$ (from Wisdom 1987) 
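The surface of section described above can be produced with a few lines of code. The sketch below is one possible way of doing it, not the procedure used in the work cited: it assumes units in which a = 1 and n = 2π/T = 1, divides (6.89') by $I_3$ so that only the asymmetry parameter of (6.93), read as $\alpha = \sqrt{3(I_2 - I_1)/I_3}$, and the eccentricity enter, obtains r(t) and $\Phi(t)$ from Kepler's equation, and uses a classical fourth-order Runge-Kutta step. All function names and the step size are illustrative choices.

```python
import math

TWO_PI = 2.0 * math.pi

def eccentric_anomaly(mean_anomaly, ecc):
    """Solve Kepler's equation M = E - ecc*sin(E) for E by Newton iteration."""
    E = mean_anomaly
    for _ in range(50):
        step = (E - ecc * math.sin(E) - mean_anomaly) / (1.0 - ecc * math.cos(E))
        E -= step
        if abs(step) < 1e-14:
            break
    return E

def orbit(t, ecc):
    """Return (a/r, Phi) on the Kepler ellipse; t = 0 is a passage through P."""
    E = eccentric_anomaly(math.fmod(t, TWO_PI), ecc)
    a_over_r = 1.0 / (1.0 - ecc * math.cos(E))
    phi = 2.0 * math.atan2(math.sqrt(1.0 + ecc) * math.sin(0.5 * E),
                           math.sqrt(1.0 - ecc) * math.cos(0.5 * E))
    return a_over_r, phi

def theta_ddot(t, theta, alpha, ecc):
    """Right-hand side of (6.89') divided by I_3, in units with n = 1."""
    a_over_r, phi = orbit(t, ecc)
    return -0.5 * alpha**2 * a_over_r**3 * math.sin(2.0 * (theta - phi))

def rk4_step(t, theta, omega, h, alpha, ecc):
    """One Runge-Kutta step for theta' = omega, omega' = theta_ddot."""
    k1 = (omega, theta_ddot(t, theta, alpha, ecc))
    k2 = (omega + 0.5 * h * k1[1],
          theta_ddot(t + 0.5 * h, theta + 0.5 * h * k1[0], alpha, ecc))
    k3 = (omega + 0.5 * h * k2[1],
          theta_ddot(t + 0.5 * h, theta + 0.5 * h * k2[0], alpha, ecc))
    k4 = (omega + h * k3[1],
          theta_ddot(t + h, theta + h * k3[0], alpha, ecc))
    theta += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    omega += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return theta, omega

def surface_of_section(theta0, omega0, alpha=0.89, ecc=0.1,
                       n_orbits=200, steps_per_orbit=400):
    """Record (theta mod pi, (dtheta/dt)/n) at every passage through P."""
    h = TWO_PI / steps_per_orbit
    theta, omega = theta0, omega0
    points = []
    for k in range(n_orbits):
        for s in range(steps_per_orbit):
            t = (k * steps_per_orbit + s) * h
            theta, omega = rk4_step(t, theta, omega, h, alpha, ecc)
        points.append((theta % math.pi, omega))
    return points
```

Calling `surface_of_section` for a range of initial values and plotting all recorded points in one diagram should give pictures of the kind described in the text. For `ecc=0.0` the energy (6.92) is conserved, which provides a simple check of the integrator.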

A more detailed analysis shows that both in the chaotic domain and in the 
synchronous state the orientation of the spin axis perpendicular to the orbit plane 
is unstable. One says that the motion is attitude unstable. This means that even a 
small deviation of the spin axis from the vertical (the direction perpendicular to the 
orbit plane) will grow exponentially, on average. The time scale for the ensuing 
tumbling is of the order of a few orbital periods. The final stage of a spherically 
symmetric moon, as described above, is completely unstable for the asymmetric 
satellite Hyperion. Note, however, that once the axis of rotation deviates from the 
vertical, one has to solve the full set of the nonlinear Eulerian equations (3.52). 
In doing this one finds, indeed, that the motion is completely chaotic: all three 
Liapunov characteristic exponents are found to be positive (of the order of 0.1). 
In order to appreciate the chaoticity of Hyperion’s tumbling the following remark 
may be helpful. Even if one had measured the orientation of its axis of rotation to 
ten decimal places, at the time of the passage of Voyager 1 in November 1980, it 
would not have been possible to predict the orientation at the time of the passage 
of Voyager 2 in August 1981, only nine months later. 

Up to this point tidal friction has been completely neglected and the system is 
exactly Hamiltonian. Tidal friction, although unimportant in the final stage, was 
important in the history of Hyperion. Its evolution may be sketched as follows 
(Wisdom 1987). In the beginning the spin period presumably was much shorter 
and Hyperion probably began its evolution in a domain high above the one shown 
in Fig. 6.28. Over a time period of the order of the age of our solar system the spin 
rotation was slowed down, while the obliquity of the axis of rotation with respect 
to the vertical decreased to zero. Once the axis was vertical, the assumptions on 
which the model (6.89') and the results shown in Fig. 6.28 are based came close 
to being realized. However, as soon as Hyperion entered the chaotic regime, “the 
work of the tides over aeons was undone in a matter of days” (Wisdom 1987). It 
began to tumble erratically and has continued to do so to this day.¹² 

In order to understand further the rather strange result illustrated by Fig. 6.28 
for the case of Hyperion, we show the results of the same calculation for two 
satellites of Mars in Figs. 6.29 and 6.30: Deimos and Phobos. The asymmetry parameter 

$$\alpha = \sqrt{\frac{3 (I_2 - I_1)}{I_3}} , \tag{6.93}$$

which is the relevant quantity in the equation of motion (6.89') and whose value 
is 0.89 in the case of Hyperion, is very similar for Deimos and Phobos: 0.81 and 
0.83, respectively. However, the eccentricities of their orbits around Mars are much 
smaller than for Hyperion. They are 0.0005 for Deimos and 0.015 for Phobos. The 
synchronous phase at $(\theta = 0, (1/n)\, d\theta/dt = 1)$ that we know from our moon is 
still clearly visible in Figs. 6.29 and 6.30, while in the case of Hyperion it has 
drifted down in Fig. 6.28. Owing to the smallness of the eccentricities, the chaotic 
domains are correspondingly less developed. Even though today Deimos and Pho- 
bos no longer tumble, they must have gone through long periods of chaotic tum- 
bling in the course of their history. One can estimate that Deimos’ chaotic tumbling 
phase may have lasted about 100 million years, whereas Phobos’ tumbling phase 
may have lasted about 10 million years. 
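The value 0.89 quoted for Hyperion can be made plausible from the linear dimensions given earlier. The following sketch rests on two assumptions that are not made in the text: the satellite is treated as a homogeneous triaxial ellipsoid, and the asymmetry parameter is read as $\alpha = \sqrt{3(I_2 - I_1)/I_3}$. The function names are hypothetical.

```python
import math

def asymmetry_parameter(length_1, length_2):
    """alpha = sqrt(3 (I_2 - I_1) / I_3) for a homogeneous triaxial ellipsoid.

    length_1 >= length_2 are the extensions along the 1- and 2-axes. For such
    a body I_1 ~ l2^2 + l3^2, I_2 ~ l1^2 + l3^2 and I_3 ~ l1^2 + l2^2, so the
    third extension and the common prefactor drop out of the ratio.
    """
    ratio = (length_1**2 - length_2**2) / (length_1**2 + length_2**2)
    return math.sqrt(3.0 * ratio)

def asymmetry_four_points(d, e):
    """Same quantity for the four-point model, using the moments (6.87)."""
    i1, i2, i3 = 0.5 * e**2, 0.5 * d**2, 0.5 * (d**2 + e**2)   # in units of m
    return math.sqrt(3.0 * (i2 - i1) / i3)

if __name__ == "__main__":
    # Hyperion: 190 km and 145 km along the 1- and 2-axes
    print(round(asymmetry_parameter(190.0, 145.0), 2))
```

With 190 km and 145 km the result is 0.89 to two decimals, in agreement with the value stated above. In the four-point model the same ratio appears with d and e in place of the two lengths.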

6.6.2 Orbital Dynamics of Asteroids with Chaotic Behavior 

As we learnt in Sect. 2.37 the manifold of motions of an integrable Hamiltonian 
system with f degrees of freedom is $\Delta^f \times T^f$, with 
$\Delta^f = \Delta_1 \times \Delta_2 \times \ldots \times \Delta_f$ being the range of the action variables 
$I_1, I_2, \ldots, I_f$ and $T^f$ the f-dimensional torus spanned by the angle variables 
$\theta_1, \theta_2, \ldots, \theta_f$. Depending on whether or not the 
corresponding fundamental frequencies are rationally dependent, one talks about 
resonant or nonresonant tori, respectively. These tori (the so-called KAM tori) and 
their stability with respect to small perturbations play an important role in the 
perturbation theory of Hamiltonian systems, as explained in Sect. 2.39. 

In the past it was held that the Kirkwood gaps referred to in the introduction 
were due to a breakdown of the KAM tori in the neighborhood of resonances. 
It seems that this rather qualitative explanation is not conclusive. Instead, recent 
investigations of the dynamics of asteroids, which are based on long-term calcula- 
tions, seem to indicate that the Kirkwood gaps are due rather to chaotic behavior 
in a Hamiltonian system. 

Here we wish to describe briefly one of the examples studied, namely the gap 
in the asteroid belt between Mars and Jupiter, which occurs at the ratio 3:1 of the 
periods of the asteroid and of Jupiter. Clearly, the integration of the equations of 
motion over a time span of several millions of years is a difficult problem of applied 
mathematics for which dedicated methods had to be designed. We cannot go into 
these methods¹³ and must restrict the discussion to a few characteristic results. 

12 The observations of Voyager 2 are consistent with this prediction, since it found Hyperion in a 
position clearly out of the vertical. More recently, Hyperion’s tumbling was positively observed 
from the earth (J. Klavetter et al., Science 246 (1989) 998, Astron. J. 98 (1989) 1855). 

Fig. 6.31. Eccentricity of a typical orbit in the chaotic domain close to the 3:1 resonance, 
as a function of time measured in millions of years. From periods of small, though 
irregular, values of $\varepsilon$ the orbit makes long-term excursions to large values of the 
eccentricity 

Fig. 6.32. Surface of section for the orbit shown in Fig. 6.31. The radial coordinate of 
the points shown is the eccentricity 

The main result of these calculations is that the orbits of asteroids in the 
neighborhood of the 3:1 resonance exhibit chaotic behavior in the following sense: the 
eccentricity of the asteroid’s elliptic orbit varies in an irregular way, as a function of 
time, such that an asteroid with an initial eccentricity of, say, 0.1 makes long 
excursions to larger eccentricities. Figure 6.31 shows an example for a time interval of 
2.5 million years, which was calculated for the planar system Sun-asteroid-Jupiter. 
The problem is formulated in terms of the coordinates (x, y) of the asteroid in the 
plane and in terms of the time dependence of the orbit parameters due to the 
motion of Jupiter along its orbit. Averaging over the orbital period yields an effective, 
two-dimensional system for which one defines effective coordinates 

$$x = \varepsilon \cos(\varpi - \varpi_J) , \qquad y = \varepsilon \sin(\varpi - \varpi_J) . \tag{6.94}$$

Here $\varpi$ and $\varpi_J$ are the longitudes of the perihelia for the asteroid and for Jupiter, 
respectively. The quantities (6.94) yield a kind of Poincaré section if one records 
x and y each time a certain combination of the mean longitudes goes to zero. 
Figure 6.32 shows the section obtained in this way for the orbit shown in Fig. 6.31. 
These figures show clearly that orbits in the neighborhood of the 3:1 resonance 
are strongly chaotic. At the same time they provide a simple explanation for the 
observation that a strip of orbits in the neighborhood of this resonance is empty: all 
orbits with $\varepsilon > 0.4$ cross the orbit of Mars. As we know that orbits in the 
neighborhood of the resonance make long excursions to larger eccentricities, there is a finite 
probability for the asteroids to come close to Mars, or even to hit this planet, and 
thereby to be scattered out of their original orbit. Thus, deterministically chaotic 
motion played an important role in the formation of the 3:1 Kirkwood gap.¹⁴ 

13 The reader will find hints to the original literature describing these methods in Wisdom’s review 
(1987). 

Another, very interesting observation, which follows from these investigations, 
is that irregular behavior near the 3:1 resonance may play an important role in the 
transport of meteorites from the asteroid belt to the earth. Indeed, the calculations 
show that asteroidal orbits starting at $\varepsilon = 0.15$ make long-term excursions to 
eccentricities $\varepsilon = 0.6$ and beyond. In this case they cross the orbit of the earth. 
Therefore, chaotic orbits in the neighborhood of the 3:1 gap can carry debris from 
collisions between asteroids directly to the surface of earth. In other words, deter- 
ministically chaotic motion may be responsible for an important transport mecha- 
nism of meteorites to earth, i.e. of objects that contain important information about 
the history of our solar system. 
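The two threshold eccentricities quoted in this subsection, about 0.4 for crossing the orbit of Mars and about 0.6 for crossing the orbit of the earth, follow from Kepler's third law by simple arithmetic. The sketch below makes simplifying assumptions that are not part of the text: the planets move on circles, and their semimajor axes are the rounded values 5.20, 1.52 and 1.00 astronomical units. The function names are hypothetical.

```python
def resonance_semimajor_axis(a_jupiter, period_ratio):
    """Kepler's third law: a = a_J * (T / T_J)**(2/3), with T_J / T = period_ratio."""
    return a_jupiter * (1.0 / period_ratio) ** (2.0 / 3.0)

def crossing_eccentricity(a_asteroid, a_planet):
    """Smallest eccentricity for which the perihelion a(1 - ecc) reaches a_planet."""
    return 1.0 - a_planet / a_asteroid

# Rounded semimajor axes in astronomical units (assumed inputs, circular orbits).
A_JUPITER, A_MARS, A_EARTH = 5.20, 1.52, 1.00

a_res = resonance_semimajor_axis(A_JUPITER, 3.0)      # 3:1 resonance
ecc_mars = crossing_eccentricity(a_res, A_MARS)
ecc_earth = crossing_eccentricity(a_res, A_EARTH)
print(f"a = {a_res:.2f} AU, Mars-crossing above {ecc_mars:.2f}, "
      f"Earth-crossing above {ecc_earth:.2f}")
```

The resonant orbit lies at about 2.5 AU, so the perihelion reaches the orbit of Mars at an eccentricity of roughly 0.39 and the orbit of the earth at roughly 0.60, consistent with the values used in the text.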

In this last section we returned to celestial mechanics, the point of departure 
of all of mechanics. Here, however, we discovered qualitatively new types of de- 
terministic motion that are very different from the serene and smooth running of 
the planetary clockwork whose construction principles were investigated by Ke- 
pler. The solar system was always perceived as the prime example of a mechanical 
system evolving with great regularity and impressive predictability. We have now 
learnt that it contains chaotic behavior (tumbling of asymmetric satellites, chaotic 
variations of orbital eccentricities of asteroids near resonances, the chaotic motion 
of Pluto) very different from the harmony and regularity that, historically, one ex- 
pected to find. At the same time, we have learnt that mechanics is not a closed 
subject that has disappeared in the dusty archives of physics. On the contrary, it 
is more than ever a lively and fascinating field of research, which deals with im- 
portant and basic questions in many areas of dynamics. 



14 Analogous investigations of the 2:1 and 3:2 resonances indicate that there is chaotic behavior at 
the former while there is none at the latter. This is in agreement with the observation that there 
is a gap at 2:1 but not at 3:2. 

A distinctive feature of the mechanical systems we have discussed so far is that 
their number of degrees of freedom is finite and hence countable. The mechanics 
of deformable macroscopic media goes beyond this framework. The reaction of a 
solid state to external forces, the flow behavior of a liquid in a force field, or the 
dynamics of a gas in a vessel cannot be described by means of finitely many coor- 
dinate variables. The coordinates and momenta of point mechanics are replaced by 
field quantities, i.e. functions or fields defined over space and time, which describe 
the dynamics of the system. The mechanics of continua is an important discipline 
of classical physics on its own and goes far beyond the scope of this book. In 
this epilog we introduce the important concept of dynamical field, generalize the 
principles of canonical mechanics to continuous systems, and illustrate them by 
means of some instructive examples. At the same time, this serves as a basis for 
electrodynamics, which is a typical and especially important field theory. 



7.1 Discrete and Continuous Systems 

Earlier we pointed out the asymmetry between the time variable on the one hand 
and the space variables on the other, which is characteristic for nonrelativistic 
physics, cf. Sects. 1.6 and 4.7. In a Galilei-invariant world, time has an absolute 
nature while space does not. In the mechanics of mass points and of rigid bod- 
ies there is still another asymmetry, which we also pointed out in Sect. 1.6 and 
which is this: time plays the role of a parameter, whereas the position $\mathbf{r}(t)$ of a 
particle, or, likewise, the coordinates $\{\mathbf{r}_S(t), \theta_k(t)\}$ of a rigid body, or, even more 
generally, the flow $\Phi(t, t_0, x_0)$ in phase space are the genuine, dynamical 
variables that obey the mechanical equations of motion. Geometrically speaking, the 
latter are the “geometrical curves”, while t is the orbit parameter (length of arc) 
that indicates in which way the system moves along its orbits. 

This is different for the case of a continuous system, independent of whether 
it is to be described nonrelativistically or relativistically. Here, besides the time 
coordinate, also the space coordinates take over the role of parameters. Their pre- 
vious role as dynamical variables is taken over by new objects, the fields. It is the 
fields that describe the state of motion of the system and obey a set of equations 
of motion. We develop this important new concept by means of a simple example. 

Example. Linear chain and vibrating string. Let n mass points of mass m be 
joined by identical, elastic springs in such a way that their equilibrium positions 
are $x_1^0, x_2^0, \ldots, x_n^0$, cf. Fig. 7.1. As shown in part (a) of that figure we displace 
the mass points along the straight line joining them. The deviations from the 
equilibrium positions are denoted by 

$$u_i(t) = x_i(t) - x_i^0 , \qquad i = 1, 2, \ldots, n .$$

The kinetic energy is given by 

$$T = \sum_{i=1}^{n} \tfrac{1}{2} m \dot u_i^2 . \tag{7.1}$$

The forces being harmonic the potential energy reads 

$$U = \sum_{i=1}^{n-1} \tfrac{1}{2} k (u_{i+1} - u_i)^2 + \tfrac{1}{2} k (u_1^2 + u_n^2) . \tag{7.2}$$

The last two terms stem from the spring connecting particle 1 with the wall and 
from the one connecting particle n with the wall, at the other end of the chain. We 
ascribe the coordinate $x_0^0$ to the left suspension point and $x_{n+1}^0$ to the right 
suspension point of the chain and we require that their deviations and their velocities 
be zero at all times, i.e. $u_0(t) = u_{n+1}(t) = 0$. The potential energy can then be 
written as 

$$U = \tfrac{1}{2} k \sum_{i=0}^{n} (u_{i+1} - u_i)^2 , \tag{7.2'}$$

and the natural form of the Lagrangian function reads 

$$L = T - U , \tag{7.3}$$

with T as given by (7.1) and U by (7.2'). This Lagrangian function describes 
longitudinal motions of the mass points, i.e. motions along their line of connection. 
We obtain the same form of the Lagrangian function if we let the mass points 
move only transversely to that line, i.e. as shown in Fig. 7.1b. Let d be the dis- 
tance between the equilibrium positions. The distance between neighboring mass 
points can be approximated as follows: 




$$\sqrt{d^2 + (v_{i+1} - v_i)^2} \simeq d + \frac{1}{2} \frac{(v_{i+1} - v_i)^2}{d} ,$$

provided the differences of transverse amplitudes remain small compared to d. The 
force driving the mass points back is approximately transverse, its potential energy 
being given by 

$$U = \frac{S}{2d} \sum_{i=0}^{n} (v_{i+1} - v_i)^2 , \tag{7.4}$$

where S is the string constant. As before, we must take into account the condition 
$v_0(t) = v_{n+1}(t) = 0$ for the two points of suspension. In reality, the chain can 
perform longitudinal and transverse motions simultaneously and the two types of 
motion will be coupled. For the sake of simplicity, we restrict the discussion to 
purely transverse or purely longitudinal motions and do not consider mixed modes. 

Fig. 7.1. A linear chain of finitely many mass points, which may oscillate 
(a) longitudinally or (b) transversely, (c) shows a vibrating string of the same length 
as the chain, for comparison 

In the first case we set $\omega_0 = \sqrt{k/m}$ and $q_i(t) = u_i(t)$. In the second case we 
set $\omega_0 = \sqrt{S/(md)}$ and $q_i(t) = v_i(t)$. In either case the Lagrangian function reads 

$$L = \tfrac{1}{2} m \sum_{j=0}^{n} \left[ \dot q_j^2 - \omega_0^2 (q_{j+1} - q_j)^2 \right] , \tag{7.5}$$

with the conditions $q_0 = \dot q_0 = 0$, $q_{n+1} = \dot q_{n+1} = 0$. The equations of motion, 
which follow from (7.5), are 

$$\ddot q_j = \omega_0^2 (q_{j+1} - q_j) - \omega_0^2 (q_j - q_{j-1}) , \qquad j = 1, \ldots, n . \tag{7.6}$$

We solve these equations by means of the following substitution: 

$$q_j(t) = A \sin\left( j \frac{p\pi}{n+1} \right) e^{i \omega_p t} . \tag{7.7}$$

Obviously, we can let j run from 0 to n + 1 because $q_0$ and $q_{n+1}$ vanish for all 
times. In (7.7) p is a positive integer. The quantities $\omega_p$ are the eigenfrequencies of 
the coupled system (7.6) and could be determined by means of the general method 
developed in Practical Example 1 of Chap. 2. Here they may be obtained directly 
by inserting the substitution (7.7) into the equation of motion (7.6). One obtains 

$$\omega_p^2 = 2 \omega_0^2 \left[ 1 - \cos\left( \frac{p\pi}{n+1} \right) \right]$$

or 

$$\omega_p = 2 \omega_0 \sin\left( \frac{p\pi}{2(n+1)} \right) . \tag{7.8}$$

Hence, the normal modes of the system are 

$$q_j^{(p)}(t) = A^{(p)} \sin\left( j \frac{p\pi}{n+1} \right) \sin(\omega_p t) \tag{7.9}$$

with $p = 1, \ldots, n$; $j = 0, 1, \ldots, n + 1$, 
and the most general solution reads 

$$q_j(t) = \sum_{p=1}^{n} A^{(p)} \sin\left( j \frac{p\pi}{n+1} \right) \sin(\omega_p t + \varphi_p) ,$$

where the amplitudes $A^{(p)}$ and the phases $\varphi_p$ are arbitrary integration constants. 
(As expected, the most general solution depends on 2n integration constants.) 
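That the modes with amplitudes $\sin(j p\pi/(n+1))$ and frequencies $\omega_p = 2\omega_0 \sin(p\pi/(2(n+1)))$ really solve the equations of motion (7.6) is easy to confirm numerically. The following sketch (hypothetical function names, $\omega_0 = 1$ by default) inserts a mode with harmonic time dependence into (7.6) and returns the largest residual.

```python
import math

def mode_shape(n, p):
    """Amplitudes sin(j p pi / (n + 1)) for j = 0, ..., n + 1 (end points fixed)."""
    return [math.sin(j * p * math.pi / (n + 1)) for j in range(n + 2)]

def eigenfrequency(n, p, omega0=1.0):
    """Eigenfrequency (7.8) of the chain."""
    return 2.0 * omega0 * math.sin(p * math.pi / (2.0 * (n + 1)))

def max_residual(n, p, omega0=1.0):
    """Largest violation of (7.6) for q_j(t) = shape_j * sin(omega_p t).

    For this time dependence the second derivative is -omega_p**2 * q_j, so the
    common time factor drops out and only the amplitudes need to be compared.
    """
    q = mode_shape(n, p)
    w2 = eigenfrequency(n, p, omega0) ** 2
    return max(abs(-w2 * q[j] - omega0**2 * (q[j + 1] - 2.0 * q[j] + q[j - 1]))
               for j in range(1, n + 1))

if __name__ == "__main__":
    n = 7
    for p in range(1, n + 1):
        print(p, eigenfrequency(n, p), max_residual(n, p))
```

The residuals should vanish up to rounding errors for every p between 1 and n, and the amplitudes at j = 0 and j = n + 1 vanish, as required by the boundary conditions.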

Let us compare these solutions, for the example of transverse oscillations, to 
the normal modes of a vibrating string spanned between the same end points as 
the chain (see Fig. 7.1c). The length of the string is L = (n + 1)d. Its state of 
vibration for the pth harmonic is described by 

$$\varphi(x, t) = A^{(p)} \sin\left( \frac{p\pi x}{L} \right) \sin(\bar\omega_p t) , \qquad \bar\omega_p = p\, \bar\omega_0 . \tag{7.10}$$

Here, $\bar\omega_p$ is the p-fold of a basic frequency $\bar\omega_0$ that we may choose such that it 
coincides with the frequency (7.8) of the chain for p = 1, viz. 

$$\bar\omega_0 = 2 \omega_0 \sin\left( \frac{\pi}{2(n+1)} \right) . \tag{7.11}$$

The solution (7.10) is closely related to the solution (7.9); we shall work out the 
exact relationship in the next subsection. Here we wish to discuss a direct 
comparison of (7.9) and (7.10). 

At a fixed time t the amplitude of the normal mode (7.9), with a given p and 
with $1 \le p \le n$, has exactly the same shape as the amplitude of the vibration 
(7.10) at the points x = jL/(n + 1) on the string. Figure 7.2 shows the example 
p = 2 for n = 7 mass points. The full curve shows the second harmonic (p = 2) of the 
vibrating string; the points indicate the positions of the seven mass points according 
to the normal oscillation (7.9) with p = 2. (Note, however, that the frequencies 
$\omega_p$ and $p\, \bar\omega_0$ are not the same.) 
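The comparison for p = 2 and n = 7 can be reproduced in a few lines. In the sketch below (hypothetical function names, $\omega_0 = 1$ and L = 1 as arbitrary choices) the amplitudes of the chain are compared with the profile of the string sampled at x = jL/(n + 1), and the frequency of the chain is compared with p times the basic frequency of the string.

```python
import math

def chain_frequency(n, p, omega0=1.0):
    """Eigenfrequency 2 omega_0 sin(p pi / (2 (n + 1))) of the chain."""
    return 2.0 * omega0 * math.sin(p * math.pi / (2.0 * (n + 1)))

def string_frequency(n, p, omega0=1.0):
    """p times the basic frequency of the string, matched to the chain at p = 1."""
    return p * chain_frequency(n, 1, omega0)

def chain_amplitudes(n, p):
    """Spatial factor of the chain's normal mode at the points j = 0, ..., n + 1."""
    return [math.sin(j * p * math.pi / (n + 1)) for j in range(n + 2)]

def string_profile(n, p, length=1.0):
    """String amplitude sin(p pi x / L) sampled at x = j L / (n + 1)."""
    xs = [j * length / (n + 1) for j in range(n + 2)]
    return [math.sin(p * math.pi * x / length) for x in xs]

if __name__ == "__main__":
    n, p = 7, 2
    diff = max(abs(a - b) for a, b in zip(chain_amplitudes(n, p), string_profile(n, p)))
    print("largest difference of the shapes:", diff)
    print("chain:", chain_frequency(n, p), "string:", string_frequency(n, p))
```

The shapes coincide at the positions of the mass points, while the frequency of the chain stays below that of the string for every p larger than 1.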

The discrete system (7.9) has n degrees of freedom which, clearly, are countable. 
The dynamical variables are the coordinates $q_j^{(p)}(t)$ and the corresponding 
momenta $p_j^{(p)}(t) = m \dot q_j^{(p)}(t)$. Time plays the role of a parameter. 

In the continuous system (7.10) we are interested in the local amplitude $\varphi(x, t)$, 
for fixed time and as a function of the continuous variable $x \in [0, L]$. Thus, the 
function $\varphi$ over time t and position x on the string takes over the role of a 
dynamical variable. 

If we suppose that the continuous system was obtained from the discrete system 
by letting the number of particles n become very large and their distance d 
correspondingly small, we realize that the variable x of the former takes over the 
role of the counting index j of the latter. This means, firstly, that the number of 
degrees of freedom has become infinite and that the degrees of freedom are not 
even countable. Secondly, the coordinate x, very much like the time t, has become 
a parameter. For given $t = t_0$ the function $\varphi(x, t_0)$ describes the shape of the 
vibration in the space $x \in [0, L]$; conversely, for fixed $x = x_0$, $\varphi(x_0, t)$ describes 
the motion of the string at that point, as a function of time. 



7.2 Transition to the Continuous System 



The transition from the discrete chain of mass points to the continuous string can 
be performed explicitly for the examples (7.2') and (7.4) of longitudinal or 
transverse vibrations. Taking the number of mass points n to be very large and their 
distance d to be infinitesimally small (such that (n + 1)d = L stays finite), we 
have $q_j(t) = \varphi(x = jL/(n+1), t)$ and 

$$q_{j+1} - q_j \simeq \left. \frac{\partial\varphi}{\partial x} \right|_{x = jd + d/2} d , \qquad q_j - q_{j-1} \simeq \left. \frac{\partial\varphi}{\partial x} \right|_{x = jd - d/2} d ,$$

and therefore 

$$(q_{j+1} - q_j) - (q_j - q_{j-1}) \simeq \left. \frac{\partial^2\varphi}{\partial x^2} \right|_{x = jd} d^2 .$$

The equation of motion (7.6) becomes the differential equation 

$$\frac{\partial^2\varphi}{\partial t^2} \simeq \omega_0^2 d^2 \frac{\partial^2\varphi}{\partial x^2} . \tag{7.6'}$$

In the case of longitudinal vibrations, $\omega_0^2 d^2 = k d^2/m$. In the limit $n \to \infty$ the 
ratio m/d becomes the mass density $\varrho$ per unit length, while the product of the spring 
constant k and the distance d of neighboring points is replaced by the modulus of 
elasticity $\eta = kd$. With the notation $v^2 = \eta/\varrho$ equation (7.6') reads 

$$\frac{\partial^2\varphi(x, t)}{\partial t^2} - v^2 \frac{\partial^2\varphi(x, t)}{\partial x^2} = 0 . \tag{7.12}$$

This differential equation is the wave equation in one spatial dimension. 

In the case of transverse motion one obtains the same differential equation, 
with $v^2 = S/\varrho$. The quantity v has the dimension of velocity. It represents the 
speed of propagation of longitudinal or transverse waves. 
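The limit can also be followed at the level of the eigenfrequencies. If the product $\omega_0 d = v$ is held fixed while n grows, the frequencies $2\omega_0 \sin(p\pi/(2(n+1)))$ of the chain should approach the frequencies $p\pi v/L$ of the standing waves $\sin(p\pi x/L)\sin(\omega t)$, which solve the wave equation with fixed ends. The sketch below illustrates this; the function names and the values v = 1, L = 1 are assumptions made for the example.

```python
import math

def chain_frequency(n, p, v=1.0, length=1.0):
    """Chain eigenfrequency with omega_0 = v / d and d = length / (n + 1)."""
    d = length / (n + 1)
    return 2.0 * (v / d) * math.sin(p * math.pi / (2.0 * (n + 1)))

def string_frequency(p, v=1.0, length=1.0):
    """Frequency p pi v / L of the standing wave sin(p pi x / L) sin(omega t)."""
    return p * math.pi * v / length

if __name__ == "__main__":
    p = 3
    for n in (10, 100, 1000, 10000):
        print(n, chain_frequency(n, p), string_frequency(p))
```

For fixed p the frequencies of the chain increase monotonically with n towards the value of the string; the relative difference is of order $(p\pi d/L)^2$.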

In a next step let us study the limit of the Lagrangian function obtained in 
performing the transition to the continuum. The sum over the mass points is to be 
replaced by the integral over x, the mass m by the product $\varrho d$, and the quantity 
$m \omega_0^2 (q_{j+1} - q_j)^2$ by 

$$\varrho d\, \omega_0^2 d^2 \left( \frac{\partial\varphi}{\partial x} \right)^2 = \varrho d\, v^2 \left( \frac{\partial\varphi}{\partial x} \right)^2 .$$

The infinitesimal distance d is nothing but the differential dx. Thus, we obtain 

$$L = \int_0^L dx\, \mathcal{L} , \tag{7.13}$$

where 

$$\mathcal{L} = \frac{1}{2} \varrho \left[ \left( \frac{\partial\varphi}{\partial t} \right)^2 - v^2 \left( \frac{\partial\varphi}{\partial x} \right)^2 \right] . \tag{7.14}$$

The function $\mathcal{L}$ is called the Lagrangian density. In the general case, it depends on 
the field $\varphi(x, t)$, its derivatives with respect to space and time, and, possibly, also 
explicitly on t and x, i.e. it has the form 

$$\mathcal{L} \equiv \mathcal{L}\left( \varphi, \frac{\partial\varphi}{\partial x}, \frac{\partial\varphi}{\partial t}, x, t \right) . \tag{7.15}$$

The analogy to the Lagrangian function of point mechanics is the following. The 
dynamical variable q is replaced by the field $\varphi$, $\dot q$ is replaced by the partial 
derivatives $\partial\varphi/\partial x$ and $\partial\varphi/\partial t$, and the time parameter is replaced by the space and time 
coordinates x and t. The spatial coordinate now plays the same role as the time 
coordinate, and therefore a certain symmetry between the two types of coordinates 
is restored. 

We now turn to the question whether the equation of motion (7.12) can be obtained
from the Lagrangian density (7.14) or, in the more general case, from (7.15).
Let $\mathcal{L}(\varphi, \partial\varphi/\partial x, \partial\varphi/\partial t, x, t)$ be a Lagrangian density assumed to be at least $C^1$
in the field $\varphi$ and in its derivatives. Let $L = \int dx\,\mathcal{L}$ be the corresponding
Lagrangian function. For the sake of simplicity we consider the example of one
spatial dimension. The generalization to three dimensions is straightforward and can
be guessed easily at the end.

As $\varphi$ is the dynamical variable, Hamilton's variational principle now requires
that the functional

$$ I[\varphi] := \int_{t_1}^{t_2} dt\, L = \int_{t_1}^{t_2} dt \int dx\, \mathcal{L} \tag{7.16} $$

assumes an extreme value if $\varphi$ is a physically possible solution. As in the
mechanics of mass points one embeds the solution with given values $\varphi(x, t_1)$ and
$\varphi(x, t_2)$ at the end points in a set of comparative fields. In other words, one varies
the field $\varphi$ such that its variation vanishes at the times $t_1$ and $t_2$, and requires $I[\varphi]$
to be an extremum. Let $\delta\varphi$ denote the variation of the field, $\dot\varphi$ the time derivative
and $\varphi'$ the space derivative of $\varphi$. Then



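$$ \delta I[\varphi] = \int_{t_1}^{t_2} dt \int dx \left\{ \frac{\partial\mathcal{L}}{\partial\varphi}\,\delta\varphi + \frac{\partial\mathcal{L}}{\partial\dot\varphi}\,\delta\dot\varphi + \frac{\partial\mathcal{L}}{\partial\varphi'}\,\delta\varphi' \right\} . $$

Clearly, the variation of a derivative is equal to the derivative of the variation,

$$ \delta\dot\varphi = \frac{\partial}{\partial t}(\delta\varphi) , \qquad \delta\varphi' = \frac{\partial}{\partial x}(\delta\varphi) . $$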

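Furthermore, the field $\varphi$ shall be such that it vanishes at the boundaries of the
integration over $x$. By partial integration of the second term with respect to $t$ and
of the third term with respect to $x$, and noting that $\delta\varphi$ vanishes at the boundaries
of the integration, we obtain

$$ \delta I[\varphi] = \int_{t_1}^{t_2} dt \int dx \left\{ \frac{\partial\mathcal{L}}{\partial\varphi} - \frac{\partial}{\partial t}\frac{\partial\mathcal{L}}{\partial\dot\varphi} - \frac{\partial}{\partial x}\frac{\partial\mathcal{L}}{\partial\varphi'} \right\} \delta\varphi . $$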



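The condition $\delta I[\varphi] = 0$ is to hold for all admissible variations $\delta\varphi$. Therefore,
the expression in the curly brackets of the integrand must vanish. This yields the
Euler-Lagrange equation for continuous systems (here in one space dimension),

$$ \frac{\partial\mathcal{L}}{\partial\varphi} - \frac{\partial}{\partial t}\frac{\partial\mathcal{L}}{\partial(\partial\varphi/\partial t)} - \frac{\partial}{\partial x}\frac{\partial\mathcal{L}}{\partial(\partial\varphi/\partial x)} = 0 . \tag{7.17} $$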



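We illustrate this equation by means of the example (7.14). In this example $\mathcal{L}$ does
not depend on $\varphi$ but only on $\dot\varphi$ and on $\varphi'$. $\mathcal{L}$ does not depend explicitly on $x$ or
$t$, either. The variable $x$ is confined to the interval $[0, L]$. Both $\varphi$ and $\delta\varphi$ vanish
at the end points of this interval. We have

$$ \frac{\partial\mathcal{L}}{\partial(\partial\varphi/\partial t)} = \varrho\,\frac{\partial\varphi}{\partial t} , \qquad \frac{\partial\mathcal{L}}{\partial(\partial\varphi/\partial x)} = -\varrho v^2\,\frac{\partial\varphi}{\partial x} , $$

and the equation of motion (7.17) yields the wave equation (7.12), as expected,

$$ \frac{\partial^2\varphi}{\partial t^2} - v^2\,\frac{\partial^2\varphi}{\partial x^2} = 0 . $$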



Its general solutions are $\varphi_+(x,t) = f(x - vt)$, $\varphi_-(x,t) = f(x + vt)$, with $f(z)$
an arbitrary differentiable function of its argument $z = x \mp vt$. The first of these
describes a wave propagating in the positive $x$ direction, the second describes a
wave propagating in the negative $x$ direction. As the wave equation is linear in
the field variable $\varphi$, any linear combination of two independent solutions is also
a solution. As an example we consider two harmonic solutions (i.e. two pure sine
waves) with wavelength $\lambda$ and equal amplitude,

$$ \varphi_+ = A \sin\left[\frac{2\pi}{\lambda}(x - vt)\right] , \qquad \varphi_- = A \sin\left[\frac{2\pi}{\lambda}(x + vt)\right] . $$

Their sum

$$ \varphi = \varphi_+ + \varphi_- = 2A \sin\left(\frac{2\pi}{\lambda}x\right) \cos\left(\frac{2\pi}{\lambda}vt\right) $$

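describes a standing wave. It has precisely the form of the solution (7.10) if

$$ \frac{2\pi}{\lambda}x = \frac{p\pi x}{L} \qquad\text{or}\qquad \lambda = \frac{2L}{p} , \quad p = 1, 2, \ldots $$

The length $L$ of the string must be an integer multiple of half the wavelength. The
frequency of the vibration with wavelength $\lambda$ is given by

$$ \omega = \frac{2\pi v}{\lambda} = p\,\bar\omega_0 , \qquad\text{with}\quad \bar\omega_0 := \frac{\pi v}{L} . \tag{7.18} $$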
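As a quick numerical illustration, the sum of the two travelling sine waves can be compared with the product form of the standing wave, and the nodes at the ends of the string can be checked for $\lambda = 2L/p$. This is only a sketch; the parameter values and function names are arbitrary assumptions.

```python
import math

def travelling_sum(x, t, A, lam, v):
    """phi_+ + phi_-: two sine waves of wavelength lam running in opposite directions."""
    k = 2.0 * math.pi / lam
    return A * math.sin(k * (x - v * t)) + A * math.sin(k * (x + v * t))

def standing_wave(x, t, A, lam, v):
    """Product form 2 A sin(2 pi x / lam) cos(2 pi v t / lam)."""
    k = 2.0 * math.pi / lam
    return 2.0 * A * math.sin(k * x) * math.cos(k * v * t)

# Illustrative values (assumptions): string length, wave speed, amplitude, harmonic number.
L, v, A, p = 1.0, 2.0, 0.5, 3
lam = 2.0 * L / p  # wavelength allowed by the fixed ends

for x, t in [(0.13, 0.0), (0.40, 0.27), (0.90, 1.10)]:
    print(travelling_sum(x, t, A, lam, v), standing_wave(x, t, A, lam, v))

# The ends x = 0 and x = L are nodes at all times.
print(travelling_sum(0.0, 0.27, A, lam, v), travelling_sum(L, 0.27, A, lam, v))
```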



Thus, the transverse oscillations of our original chain of mass points are standing
waves. Note also that their frequency (7.11) takes on the correct continuum value
(7.18). Indeed, when the number $n$ of mass points is very large, the sine in (7.11)
can be replaced by its argument,

$$ \omega_p = 2\omega_0 \sin\left(\frac{p\pi}{2(n+1)}\right) \approx 2\omega_0\,\frac{p\pi}{2(n+1)} = \frac{p\pi}{L}\,\omega_0 d = \frac{p\pi v}{L} , $$

where we have set $L = (n+1)d$ and replaced $\omega_0 d$ by $v$.
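The approach of the discrete frequencies to the continuum value can be made concrete with a few numbers. The sketch below assumes that (7.11) has the form $\omega_p = 2\omega_0 \sin\bigl(p\pi/(2(n+1))\bigr)$ and keeps $v = \omega_0 d$ and $L = (n+1)d$ fixed while $n$ grows; the function names and default values are illustrative only.

```python
import math

def omega_discrete(p, n, omega0):
    """Eigenfrequency of the chain with n mass points (assumed form of (7.11))."""
    return 2.0 * omega0 * math.sin(p * math.pi / (2.0 * (n + 1)))

def omega_continuum(p, v, L):
    """Continuum eigenfrequency p * pi * v / L, cf. (7.18)."""
    return p * math.pi * v / L

def compare(p, n, v=1.0, L=1.0):
    """Return (discrete, continuum) frequencies for fixed v and L."""
    d = L / (n + 1)   # spacing of neighbouring mass points
    omega0 = v / d    # keeps v = omega0 * d fixed as n grows
    return omega_discrete(p, n, omega0), omega_continuum(p, v, L)

for n in (10, 100, 1000):
    disc, cont = compare(p=1, n=n)
    print(f"n = {n:5d}   discrete = {disc:.8f}   continuum = {cont:.8f}")
```

For a fixed harmonic number $p$ the two values agree better and better as $n$ increases; for $p$ comparable to $n$ the sine cannot be replaced by its argument and the discrete chain deviates from the string.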

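We conclude this subsection with another example in one time and three space
coordinates. Let $\varphi(\mathbf{x}, t)$ be a real field and let the Lagrangian density be given by

$$ \mathcal{L} = \frac{1}{2} \left[ \frac{1}{v^2} \left(\frac{\partial\varphi}{\partial t}\right)^2 - \sum_{i=1}^{3} \left(\frac{\partial\varphi}{\partial x^i}\right)^2 - \mu^2 \varphi^2 \right] , \tag{7.19} $$
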




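where $\mu$ has the physical dimension of an inverse length. The generalization of
(7.16) and (7.17) to three spatial dimensions is obvious. It yields the equation of
motion

$$ \frac{\partial\mathcal{L}}{\partial\varphi} - \frac{\partial}{\partial t}\frac{\partial\mathcal{L}}{\partial(\partial\varphi/\partial t)} - \sum_{i=1}^{3} \frac{\partial}{\partial x^i}\frac{\partial\mathcal{L}}{\partial(\partial\varphi/\partial x^i)} = 0 . \tag{7.20} $$

In the example defined by (7.19) we obtain

$$ \frac{1}{v^2}\frac{\partial^2\varphi}{\partial t^2} - \Delta\varphi + \mu^2\varphi = 0 , \tag{7.21} $$

where $\Delta = \sum_{i=1}^{3} \partial^2/(\partial x^i)^2$ is the Laplacian operator.

For $\mu = 0$ (7.21) is the wave equation in three space dimensions. For the case
$\mu \neq 0$ and the velocity $v$ equal to the speed of light $c$, the differential equation
(7.21) is called the Klein-Gordon equation.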



7.4 Canonically Conjugate Momentum and Hamiltonian Density



The continuous field variable $\varphi$ whose equation of motion is derived from the
Lagrangian density $\mathcal{L}$ is the analog of the coordinate variables $q_j$ of point mechanics.
The canonically conjugate momenta are defined in (2.39) to be the partial derivatives
of the Lagrangian function $L$ with respect to $\dot q_j$. Following that analogy we
define

$$ \pi(x) := \frac{\partial\mathcal{L}}{\partial(\partial\varphi/\partial t)} . \tag{7.22} $$



For example, with $\mathcal{L}$ as given in (7.14) we find $\pi(x) = \varrho\,\dot\varphi(x)$. This is nothing but
the local density of momentum for transverse vibrations of the string (or, likewise,
longitudinal vibrations of a rubber band). Following the pattern of the definition
(2.38) one constructs the function

$$ \tilde{\mathcal{H}} = \dot\varphi\,\frac{\partial\mathcal{L}}{\partial(\partial\varphi/\partial t)} - \mathcal{L} $$

and, by means of Legendre transformation, the Hamiltonian density $\mathcal{H}$. In the
example (7.14), for instance, one finds

$$ \mathcal{H} = \frac{1}{2\varrho} \left[ \pi^2(x) + \varrho^2 v^2 \left(\frac{\partial\varphi}{\partial x}\right)^2 \right] . $$

$\mathcal{H}$ describes the energy density of the vibrating system. Therefore, $H = \int_0^L dx\,\mathcal{H}$
is the total energy of the system. For example, inserting the explicit solution (7.10)
into the expression for $\mathcal{H}$ one easily finds that

$$ H = \frac{1}{4}\,\varrho L \left(A^{(p)}\right)^2 \bar\omega_0^2\, p^2 . $$

This is the total energy contained in the $p$th harmonic vibration.
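The energy of a single harmonic can be checked by integrating the energy density numerically. The sketch below assumes that the solution (7.10) has the standing-wave form $\varphi^{(p)}(x,t) = A^{(p)} \sin(p\pi x/L)\sin(\omega_p t)$ with $\omega_p = p\pi v/L$, and that the energy density is $\tfrac12\varrho\,[\dot\varphi^2 + v^2\varphi'^2]$, as follows from (7.14); the numerical values are arbitrary.

```python
import math

def energy_density(x, t, p, A, rho, v, L):
    """(rho/2) * (phi_t**2 + v**2 * phi_x**2) for the assumed p-th standing wave."""
    k = p * math.pi / L
    omega = k * v
    phi_t = A * omega * math.sin(k * x) * math.cos(omega * t)
    phi_x = A * k * math.cos(k * x) * math.sin(omega * t)
    return 0.5 * rho * (phi_t**2 + v**2 * phi_x**2)

def total_energy(t, p, A=0.01, rho=2.0, v=3.0, L=1.5, steps=2000):
    """Midpoint-rule integral of the energy density over the string [0, L]."""
    h = L / steps
    return h * sum(energy_density((i + 0.5) * h, t, p, A, rho, v, L)
                   for i in range(steps))

def closed_form(p, A=0.01, rho=2.0, v=3.0, L=1.5):
    """(1/4) * rho * L * A**2 * (pi * v / L)**2 * p**2."""
    omega_bar0 = math.pi * v / L
    return 0.25 * rho * L * A**2 * omega_bar0**2 * p**2

for t in (0.0, 0.37, 1.2):
    print(total_energy(t, p=3), closed_form(p=3))
```

The integral does not depend on the time at which it is evaluated, as expected for a conserved total energy.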



7.5 Example: The Pendulum Chain 

A generalization of the harmonic transverse oscillations of the $n$-point system of
Sect. 7.1 is provided by the chain of pendulums shown in Fig. 7.3. It consists of
$n$ identical mathematical pendulums of length $l$ and mass $m$ which are suspended
along a straight line and which swing in planes perpendicular to that line. They
are coupled by means of harmonic forces in such a way that the torque acting between
the $i$th and the $(i+1)$th pendulum is proportional to the difference of their
deviations from the vertical, i.e. is given by $-k(\varphi_{i+1} - \varphi_i)$. The line of suspension
may be thought of as being realized by a torsion bar. As the chain is fixed at its
ends, we formally add two more, motionless pendulums at either end of the bar,
to which we ascribe the numbers $0$ and $(n+1)$. This means that the angles $\varphi_0$
and $\varphi_{n+1}$ are taken to be zero at all times. The kinetic and potential energies of
this system are given by (cf. Sect. 1.17.2)

$$ T = \frac{1}{2} m l^2 \sum_{j=0}^{n+1} \dot\varphi_j^2 , \qquad U = mgl \sum_{i=0}^{n+1} (1 - \cos\varphi_i) + \frac{1}{2} k \sum_{i=0}^{n} (\varphi_{i+1} - \varphi_i)^2 . \tag{7.23} $$

Fig. 7.3. A chain of pendulums, which are coupled by harmonic forces. While the
first three show small deviations from the vertical, pendulum number $n$ has made
almost a complete turn



From the Lagrangian function in its natural form, $L = T - U$, and the Euler-Lagrange
equations (2.28) we obtain the equations of motion

$$ \ddot\varphi_i - \omega_0^2 \left[ (\varphi_{i+1} - \varphi_i) - (\varphi_i - \varphi_{i-1}) \right] + \omega_1^2 \sin\varphi_i = 0 , \quad i = 1, \ldots, n . \tag{7.24} $$

Here we introduced the following constants:

$$ \omega_0^2 := \frac{k}{m l^2} , \qquad \omega_1^2 := \frac{g}{l} . $$

With $g = 0$ (7.24) is identical to (7.6). For $k = 0$ we recover the equation of
motion of the plane pendulum that we studied in Sect. 1.17.2.
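The coupled equations (7.24) are easily integrated numerically. The following sketch uses the velocity-Verlet scheme (a choice made here, not a method taken from the text) and monitors the energy (7.23), divided by $m l^2$, as a consistency check. The number of pendulums, the constants and the initial deflection are arbitrary illustrative values.

```python
import math

def accelerations(phi, omega0_sq, omega1_sq):
    """Angular accelerations from (7.24); phi[0] and phi[-1] are the fixed end pendulums."""
    n = len(phi) - 2
    acc = [0.0] * (n + 2)
    for i in range(1, n + 1):
        coupling = (phi[i + 1] - phi[i]) - (phi[i] - phi[i - 1])
        acc[i] = omega0_sq * coupling - omega1_sq * math.sin(phi[i])
    return acc

def energy(phi, phidot, omega0_sq, omega1_sq):
    """(T + U) / (m l^2) with T and U as in (7.23)."""
    kinetic = 0.5 * sum(w * w for w in phidot)
    gravity = omega1_sq * sum(1.0 - math.cos(a) for a in phi)
    torsion = 0.5 * omega0_sq * sum((phi[i + 1] - phi[i]) ** 2
                                    for i in range(len(phi) - 1))
    return kinetic + gravity + torsion

def simulate(n=20, omega0_sq=4.0, omega1_sq=1.0, dt=1e-3, steps=5000):
    """Integrate the chain and return the energy at the start and at the end."""
    phi = [0.0] * (n + 2)
    phidot = [0.0] * (n + 2)
    phi[n // 2] = 1.0  # one pendulum starts with a large deflection (1 rad)
    e_start = energy(phi, phidot, omega0_sq, omega1_sq)
    acc = accelerations(phi, omega0_sq, omega1_sq)
    for _ in range(steps):
        for i in range(1, n + 1):          # half kick, then drift
            phidot[i] += 0.5 * dt * acc[i]
            phi[i] += dt * phidot[i]
        acc = accelerations(phi, omega0_sq, omega1_sq)
        for i in range(1, n + 1):          # second half kick
            phidot[i] += 0.5 * dt * acc[i]
    return e_start, energy(phi, phidot, omega0_sq, omega1_sq)

e0, e1 = simulate()
print(f"energy at start: {e0:.6f}   energy at end: {e1:.6f}")
```

Setting `omega1_sq=0.0` reproduces the linear chain, while `omega0_sq=0.0` decouples the pendulums, in line with the two limits mentioned above.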

Let the horizontal distance of the pendulums be $d$ so that the length of the chain
is $L = (n+1)d$. We consider the transition to the corresponding continuous system
by taking the limit $n \to \infty$, $d \to 0$. The countable variables $\varphi_1(t), \ldots, \varphi_n(t)$ are
replaced with the continuous variable $\varphi(x,t)$, $x$ taking over the role of the counting
index, which runs from $1$ to $n$. While the discrete system had $n$ degrees of
freedom, the continuous system has an uncountably infinite number of degrees of
freedom. Let $\varrho = m/d$ be the mass density. The constant in the harmonic force
is set equal to $k = \eta/d$, $\eta$ being proportional to the modulus of torsion of the
bar. When $d$ tends to zero, $k$ must formally tend to infinity in such a way that the
product $\eta = kd$ stays finite. At the same time the quantity

$$ \omega_0^2 d^2 = \frac{d}{m}\,\frac{kd}{l^2} = \frac{\eta}{\varrho\, l^2} \equiv v^2 $$

stays finite in that limit. In the same way as in Sect. 7.2, (7.24) becomes the
equation of motion

$$ \frac{\partial^2\varphi(x,t)}{\partial t^2} - v^2\,\frac{\partial^2\varphi(x,t)}{\partial x^2} + \omega_1^2 \sin\varphi(x,t) = 0 . \tag{7.25} $$



This is the wave equation (7.12), supplemented by the nonlinear term $\omega_1^2 \sin\varphi$.
Equation (7.25) is called the Sine-Gordon equation. In contrast to the wave
equation (7.12) or to the Klein-Gordon equation (7.21) it is nonlinear in the field
variable $\varphi$. It is the Euler-Lagrange equation (7.17) corresponding to the following
Lagrangian density:

$$ \mathcal{L} = \frac{1}{2}\,\varrho\, l^2 \left[ \left(\frac{\partial\varphi}{\partial t}\right)^2 - v^2 \left(\frac{\partial\varphi}{\partial x}\right)^2 - 2\omega_1^2 (1 - \cos\varphi) \right] . \tag{7.26} $$

The latter may be obtained from (7.23) in the limit described above. Let us discuss
solutions of (7.25) for two special cases.

(i) The case of small deviations from the vertical. In its discrete form (7.24) this
coupled system of nonlinear equations of motion can only be solved in closed form
for the case of small deviations from the vertical. Taking $\sin\varphi_i \approx \varphi_i$, we see that
(7.24) becomes a linear system which may be solved along the lines of Practical
Example 1 of Chap. 2, or following Sect. 7.1 above. In analogy to (7.9) we set

$$ \varphi_j^{(p)}(t) = A^{(p)} \sin\left(\frac{jp\pi}{n+1}\right) \sin(\omega_p t) , \quad j = 0, 1, \ldots, n+1 \tag{7.27} $$

and obtain $\omega_p$ in terms of $\omega_1$ and $\omega_0$, as follows:

$$ \omega_p^2 = \omega_1^2 + 2\omega_0^2 \left[ 1 - \cos\left(\frac{p\pi}{n+1}\right) \right] = \omega_1^2 + 4\omega_0^2 \sin^2\left(\frac{p\pi}{2(n+1)}\right) . \tag{7.28} $$

The corresponding solution of the continuous system (7.25), assuming small deviations
from the vertical, is obtained with $\sin\varphi(x,t) \approx \varphi(x,t)$ and by making
use of the results of Sect. 7.2. For large $n$ (7.28) gives

$$ \omega_p^2 \approx \omega_1^2 + 4\omega_0^2 \left(\frac{p\pi}{2(n+1)}\right)^2 = \omega_1^2 + v^2 \left(\frac{p\pi}{L}\right)^2 , $$

so that (7.27) yields the $p$th harmonic oscillation

$$ \varphi^{(p)}(x,t) = A^{(p)} \sin\left(\frac{p\pi x}{L}\right) \sin(\omega_p t) $$

with

$$ \omega_p^2 = \omega_1^2 + \left(\frac{p\pi}{L}\right)^2 v^2 = \omega_1^2 + p^2 \bar\omega_0^2 $$

and with $\bar\omega_0$ as in (7.18).

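(ii) Soliton solutions. For the continuous chain of infinite extension there are
interesting and simple exact solutions of the equation of motion (7.25). Introducing
the dimensionless variables

$$ z := \frac{\omega_1}{v}\,x , \qquad \tau := \omega_1 t , $$

(7.25) takes the form

$$ \frac{\partial^2\varphi(z,\tau)}{\partial\tau^2} - \frac{\partial^2\varphi(z,\tau)}{\partial z^2} + \sin\varphi(z,\tau) = 0 . \tag{7.25'} $$

Furthermore, we take $\varphi = 4 \arctan f(z,\tau)$. With $f = \tan(\varphi/4)$ and using the
well-known trigonometric formulae

$$ \sin(2x) = \frac{2\tan x}{1 + \tan^2 x} , \qquad \tan(2x) = \frac{2\tan x}{1 - \tan^2 x} , $$

we obtain

$$ \sin\varphi = 4f(1 - f^2)/(1 + f^2)^2 . $$

From (7.25') follows a differential equation for $f$,

$$ (1 + f^2) \left( \frac{\partial^2 f}{\partial\tau^2} - \frac{\partial^2 f}{\partial z^2} \right) + f \left\{ 1 - f^2 + 2 \left[ \left(\frac{\partial f}{\partial z}\right)^2 - \left(\frac{\partial f}{\partial\tau}\right)^2 \right] \right\} = 0 . $$

Finally, we set $y = (z + \alpha\tau)/\sqrt{1 - \alpha^2}$, with $\alpha$ a real parameter in the interval
$-1 < \alpha < 1$. If $f$ is understood to be a function of $y$, the following differential
equation is obtained:

$$ -(1 + f^2)\,\frac{d^2 f}{dy^2} + f \left\{ 1 - f^2 + 2 \left(\frac{df}{dy}\right)^2 \right\} = 0 . $$

It is not difficult to guess two simple solutions of the latter. They are $f_\pm = e^{\pm y}$.
Thus, the original differential equation (7.25') has the special solutions

$$ \varphi_\pm(z,\tau) = 4 \arctan\left( \exp\left\{ \pm \frac{z + \alpha\tau}{\sqrt{1 - \alpha^2}} \right\} \right) . \tag{7.29} $$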



As an example choose the positive sign, take $\alpha = -0.5$ and consider the time
$\tau = 0$. For sufficiently large negative $z$ the amplitude $\varphi_+$ is practically zero. For
$z = 0$ it is $\varphi_+(0, 0) = \pi$, while for sufficiently large positive $z$ it is almost equal
to $2\pi$. In a diagram with $z$ as the abscissa and $\varphi_+(z,\tau)$ the ordinate, this transition
of the field from the value $0$ to the value $2\pi$ propagates, with increasing time, in
the positive $z$-direction and with the (dimensionless) speed $|\alpha|$. One may visualize
the continuous pendulum chain as an infinitely long rubber belt whose width is $l$
and which is suspended vertically. The process just described is then a flip-over of
a vertical strip of the belt from $\varphi = 0$ to $\varphi = 2\pi$ which moves with constant
velocity along the rubber belt. This strange and yet simple motion is characteristic of
the nonlinear equation of motion (7.25). It is called a soliton solution. Expressing
the results in terms of the original, dimensionful variables $x$ and $t$, one sees that
the soliton moves with speed $v|\alpha|$ along the positive or the negative $x$-direction.
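That (7.29) really solves (7.25') can be verified numerically. The sketch below evaluates the left-hand side of (7.25') with central finite differences (a numerical shortcut chosen here, so the residual vanishes only up to discretization error) and prints the field values quoted above for $\alpha = -0.5$. Function names and sample points are illustrative assumptions.

```python
import math

def phi_plus(z, tau, alpha):
    """Soliton solution (7.29) with the positive sign."""
    y = (z + alpha * tau) / math.sqrt(1.0 - alpha**2)
    return 4.0 * math.atan(math.exp(y))

def residual(z, tau, alpha, h=1e-3):
    """Left-hand side of (7.25'), second derivatives by central differences."""
    f = lambda zz, tt: phi_plus(zz, tt, alpha)
    d2_tau = (f(z, tau + h) - 2.0 * f(z, tau) + f(z, tau - h)) / h**2
    d2_z = (f(z + h, tau) - 2.0 * f(z, tau) + f(z - h, tau)) / h**2
    return d2_tau - d2_z + math.sin(f(z, tau))

alpha = -0.5
print("far left  :", phi_plus(-40.0, 0.0, alpha))   # practically 0
print("centre    :", phi_plus(0.0, 0.0, alpha))     # pi
print("far right :", phi_plus(40.0, 0.0, alpha))    # practically 2 pi
print("residual  :", residual(0.3, 0.7, alpha))     # small, limited by step size h
# At tau = 2 the value pi is found at z = -alpha * tau = 1: the kink has moved to the right.
print("centre at tau = 2:", phi_plus(1.0, 2.0, alpha))
```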
7.6 Comments and Outlook



So far we have studied continuous systems by means of examples taken from the
mechanics of finitely many, say $n$, point particles, by letting $n$ go to infinity. This
limiting procedure is very useful for understanding the role of the fields $\varphi(x,t)$
as the new dynamical variables which replace the coordinate functions $q(t)$ of the
mechanics of point particles. This does not mean, however, that every continuous
system can be obtained by or could be thought of as the limit $f \to \infty$ of a discrete
system with $f$ degrees of freedom. On the contrary, the set of classical, continuous
systems is much richer than one might expect on the basis of the examples studied
above. Continuous systems form the subject of classical field theory, an important
branch of physics in its own right. Field theory, for which electrodynamics is a
prominent example, goes beyond the scope of this book and we can do no more
than add a few comments and an outlook here.

Let us suppose that the dynamics of $N$ fields

$$ \{ \varphi^i(x) \mid i = 1, 2, \ldots, N \} $$

can be described by means of a Lagrange density $\mathcal{L}$ in such a way that the equations
of motion that follow from it satisfy the postulate of special relativity (cf. Sect. 4.3),
i.e., that they are form invariant with respect to Lorentz transformations. Assume,
furthermore, that each of the fields $\varphi^i(x)$ is invariant under Lorentz transformations
of space-time, viz.

$$ \varphi'^{\,i}(x' = \Lambda x) = \varphi^i(x) \quad\text{with}\quad \Lambda \in L_+^\uparrow . $$

Fields possessing this simple transformation behavior are called scalar fields. The
variational principle (7.16) is independent of the choice of coordinates $(\mathbf{x}, t)$. Indeed,
the hypersurfaces $(\mathbf{x}, t_1 = \text{const.})$ and $(\mathbf{x}, t_2 = \text{const.})$ can be deformed into
an arbitrary, smooth, three-dimensional hypersurface $\Sigma$ in space-time. In the action
integral (7.16) one then integrates over the volume enclosed by $\Sigma$ and chooses
the variations $\delta\varphi^i$ of the fields such that they vanish on the hypersurface $\Sigma$. The
form of the equations of motion (7.17) is always the same. This has an important
consequence: Whenever the Lagrange density $\mathcal{L}$ is invariant under Lorentz
transformations, the equations of motion (7.17) which follow from (7.16) are form
invariant, i.e., they have the same form in every frame of reference.

The Lagrange density (7.14) may serve as an example for a Lorentz-invariant
theory, provided we replace the parameter $v$ by the velocity of light $c$. The equation
of motion which follows from it,

$$ \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} - \frac{\partial^2\varphi}{\partial x^2} = 0 , \tag{7.30} $$



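is form invariant. (This equation is the source-free wave equation.) With $\varphi$ a
scalar field it is even fully invariant itself. It is instructive to check this: With
$x'^{\mu} = \Lambda^{\mu}{}_{\nu} x^{\nu}$ and $x_\mu = g_{\mu\nu} x^{\nu}$, and making use of the following simplified notation
for the partial derivatives

$$ \partial_\mu := \frac{\partial}{\partial x^\mu} , \qquad \partial^\mu := \frac{\partial}{\partial x_\mu} , \tag{7.31} $$

one sees that (7.30) contains the expression $\partial_\mu \partial^\mu \varphi$. The transformation behavior
of $\partial_\nu$ follows from the following calculation:

$$ \partial_\nu \equiv \frac{\partial}{\partial x^\nu} = \frac{\partial x'^{\mu}}{\partial x^\nu}\,\frac{\partial}{\partial x'^{\mu}} = \Lambda^{\mu}{}_{\nu}\,\frac{\partial}{\partial x'^{\mu}} = \Lambda^{\mu}{}_{\nu}\,\partial'_\mu . $$

The transformation behavior of $\partial^\nu$ being the inverse of the above, the differential
operator $\partial_\nu \partial^\nu$ is a Lorentz invariant operator. It is often called the Laplace operator
in four dimensions and is denoted by the symbol $\Box$,

$$ \Box := \partial_\nu \partial^\nu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \sum_{i=1}^{3} \frac{\partial^2}{(\partial x^i)^2} = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \Delta , \tag{7.32} $$

where $\Delta$ is the Laplace operator in three dimensions. Note that the derivative terms
in (7.14) as well as in (7.19) (taking $v = c$ in either example) can be rewritten in
the form of an invariant scalar product $(\partial_\mu\varphi)(\partial^\mu\varphi)$ of

$$ \partial_\mu\varphi = \left( \frac{1}{c}\frac{\partial\varphi}{\partial t} , \nabla\varphi \right) \quad\text{and}\quad \partial^\mu\varphi = \left( \frac{1}{c}\frac{\partial\varphi}{\partial t} , -\nabla\varphi \right) . $$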
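The invariance of such scalar products rests on the defining property $\Lambda^T g \Lambda = g$ of Lorentz transformations. The following sketch checks this property for a special Lorentz transformation along the 1-axis and confirms that the Minkowski square of an arbitrary four-component object is unchanged. The boost matrix is the standard one; the helper names and the sample numbers are illustrative assumptions.

```python
import math

# Metric tensor g = diag(1, -1, -1, -1)
METRIC = [[1.0, 0.0, 0.0, 0.0],
          [0.0, -1.0, 0.0, 0.0],
          [0.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, -1.0]]

def boost_x(beta):
    """Special Lorentz transformation along the 1-axis with velocity beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def transformed_metric(lam):
    """Lambda^T g Lambda, which must reproduce g."""
    return matmul(transpose(lam), matmul(METRIC, lam))

def apply(lam, k):
    """Components k'^mu = Lambda^mu_nu k^nu."""
    return [sum(lam[i][j] * k[j] for j in range(4)) for i in range(4)]

def minkowski_square(k):
    """k_mu k^mu with the metric diag(1, -1, -1, -1)."""
    return k[0] ** 2 - k[1] ** 2 - k[2] ** 2 - k[3] ** 2

lam = boost_x(0.6)
k = [2.0, 0.3, -1.1, 0.7]  # arbitrary sample components
print(transformed_metric(lam))
print(minkowski_square(k), minkowski_square(apply(lam, k)))
```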

Thus, a Lorentz invariant theory of our fields $\varphi^i$ could be designed by means
of a Lagrange density of the form

$$ \mathcal{L}(\varphi^i, \partial_\mu\varphi^i) = \frac{1}{2} \sum_{i=1}^{N} (\partial_\mu\varphi^i)(\partial^\mu\varphi^i) - \frac{1}{2} \sum_{i=1}^{N} \lambda_i \left[ \varphi^i(x) \right]^2 - U(\varphi^i(x)) , \tag{7.33} $$

with $U(\varphi^i)$ a Lorentz scalar function of the fields. The first term on the right-hand
side of (7.33) is the analog of the kinetic energy in the mechanics of point
particles, the last term is the analog of the potential. The second term, which is
new, is called mass term because in the quantized version of the theory it does
indeed contain the rest masses of the particles which are described by the fields.
Of course, it could equally well be considered as part of the potential $U$.

In this discussion one recognizes, though in a sketchy manner only, an important
building principle for classical field theories: Very much like in mechanics of
point particles, symmetries and invariances can be read off, or can be built into,
the Lagrange density $\mathcal{L}$. Above we considered the example of form invariance with
respect to Lorentz transformations. As the next step in a deeper and more detailed
analysis one would derive the theorem of Emmy Noether, in its form adapted to
field theory, which states that the energy, the momentum, or the angular momentum
are conserved quantities whenever $\mathcal{L}$ is invariant under translations in time,
translations in space, or under rotations, respectively. A new feature is the appearance of
a local energy density (cf. the example studied in Sect. 7.4), and, analogously,
momentum and angular momentum densities. Noether's theorem concerns local
quantities. If the energy density changes locally, i.e., if it changes in a finite domain of
space and time, then there must be a continuity equation which guarantees that the
total energy (i.e., the integral of the energy density over space) remains unchanged.
Analogous statements apply to momentum and angular momentum densities.

Finally, C may possess further, inner, symmetries which have to do with trans- 
formations on the fields. In this case there are additional conservation laws, or 
continuity equations, as shown by the following simple example. 

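Given two real scalar fields and a Lagrange density of the form (7.33) which
is such that $\lambda_1 = \lambda_2 \equiv \lambda$ and where $U$ depends on the sum of the squares of the
fields only,

$$ \mathcal{L} = \frac{1}{2} \sum_{i=1}^{2} (\partial_\mu\varphi^i)(\partial^\mu\varphi^i) - \frac{1}{2}\lambda \sum_{i=1}^{2} (\varphi^i)^2 - U\!\left( \sum_{i=1}^{2} (\varphi^i)^2 \right) . \tag{7.34} $$

In addition to being invariant under Lorentz transformations in space and time $\mathcal{L}$
is obviously invariant under orthogonal transformations of the fields as a whole,
of the kind

$$ \varphi'^{\,1}(x) = \varphi^1(x) \cos\alpha - \varphi^2(x) \sin\alpha , \qquad \varphi'^{\,2}(x) = \varphi^1(x) \sin\alpha + \varphi^2(x) \cos\alpha \tag{7.35} $$

with $\alpha \in [0, 2\pi]$. Equations (7.35) describe a formal rotation in the two-dimensional,
inner, space which is spanned by the independent fields $\varphi^1$ and $\varphi^2$. In particular,
if we choose the angle $\alpha$ to be infinitesimal, $\alpha = \varepsilon$, then (7.35) becomes

$$ \delta\varphi^1 := \varphi'^{\,1} - \varphi^1 = -\varepsilon\varphi^2 , \qquad \delta\varphi^2 := \varphi'^{\,2} - \varphi^2 = \varepsilon\varphi^1 . \tag{7.36} $$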



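As these changes in the fields are special cases of variations, one can calculate
the corresponding change of $\mathcal{L}$. Let us write (7.36) as $\delta\varphi^i = \varepsilon_{ik}\varphi^k$ with
$\varepsilon_{11} = \varepsilon_{22} = 0$ and $-\varepsilon_{12} = \varepsilon_{21} = \varepsilon$. Then we find

$$ \delta\mathcal{L} = \sum_{i=1}^{2} \left( \frac{\partial\mathcal{L}}{\partial\varphi^i}\,\delta\varphi^i + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi^i)}\,\delta(\partial_\mu\varphi^i) \right) = \partial_\mu \left[ \sum_{i=1}^{2} \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi^i)}\,\delta\varphi^i \right] = \partial_\mu \left[ \sum_{i,k=1}^{2} \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi^i)}\,\varepsilon_{ik}\varphi^k \right] , $$

where we have replaced $\partial\mathcal{L}/\partial\varphi^i$ in the first term by

$$ \frac{\partial\mathcal{L}}{\partial\varphi^i} = \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi^i)} , $$

making use of the equations of motion. The right-hand side of the above equation
is nothing but a divergence in four dimensions, the quantity $\varepsilon j^\mu$ being defined by
the expression in square brackets. The left-hand side vanishes because the change
of the Lagrange density is zero, $\delta\mathcal{L} = 0$. It is not difficult to show that $j^\mu$ is a
four-vector with respect to Lorentz transformations. In the concrete example considered
here one calculates the explicit form of this vector from the Lagrange density (7.34),
making use of the formulae (7.36). The result is

$$ j^\mu(x) = \left( \partial^\mu\varphi^2(x) \right) \varphi^1(x) - \left( \partial^\mu\varphi^1(x) \right) \varphi^2(x) . \tag{7.37} $$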

The statement that the four-divergence of the quantity $j^\mu$ vanishes is, in fact, a
continuity equation. The time component $j^0$ and the space components $\mathbf{j}$ have the
same physical dimension. Therefore, if $\mathbf{j}$ is a current density, that is, if it has dimension,
e.g., charge $\times$ velocity per unit of volume, then $j^0$ is not yet a density, which,
in our example, should have dimension charge per unit volume. However, $\varrho(\mathbf{x}, t) :=
j^0/c$ is a density with the correct physical dimension. Therefore, in a given frame
of reference, we set $j^\mu = (c\varrho, \mathbf{j})$ so that the continuity equation becomes

$$ \partial_\mu j^\mu = \frac{\partial\varrho(\mathbf{x}, t)}{\partial t} + \nabla\cdot\mathbf{j}(\mathbf{x}, t) = 0 . \tag{7.38} $$

When the density $\varrho$ in a given, finite, space volume increases or decreases, this
change is compensated by a flow of charge into this volume, or out of this volume.
The total charge contained in the fields is given by the integral of the density $\varrho$
over the entire space. Provided the fields and, therefore, also the current density
$\mathbf{j}$ vanish sufficiently fast at infinity, equation (7.38) implies that the total charge
$Q := \int d^3x\,\varrho(\mathbf{x}, t)$ is a constant of the motion,

$$ \frac{d}{dt} Q = \frac{d}{dt} \int d^3x\,\varrho(\mathbf{x}, t) = -\int d^3x\,\nabla\cdot\mathbf{j}(\mathbf{x}, t) = 0 . \tag{7.39} $$

Indeed, the right-hand side of this equation vanishes because the volume integral
of the divergence over space equals the surface integral of the radial component
of $\mathbf{j}$ over the surface at infinity$^1$.

In the example (7.34) invariance with respect to the transformations (7.35) leads
to the conservation law (7.38), or (7.39), with $j^\mu(x)$ as given by the expression
(7.37). It is useful to replace the real fields $\varphi^1$ and $\varphi^2$ by a complex field and its
complex conjugate, through the definitions

$$ \phi := \frac{1}{\sqrt{2}} \left( \varphi^1 + i\varphi^2 \right) , \qquad \phi^* = \frac{1}{\sqrt{2}} \left( \varphi^1 - i\varphi^2 \right) . $$

$^1$ One shows, furthermore, that the charge Q is a Lorentz invariant quantity, i.e., that its value
does not depend on the frame of reference in which it is calculated. This holds if and only if

The Lagrange density (7.34) then takes a simpler form, namely

$$ \mathcal{L} = (\partial_\mu\phi^*)(\partial^\mu\phi) - \lambda\,\phi^*\phi - U(\phi^*\phi) . \tag{7.40} $$

Similarly, the transformation (7.35) simplifies to

$$ \phi'(x) = \phi(x)\,e^{i\alpha} , \qquad \phi'^{*}(x) = \phi^*(x)\,e^{-i\alpha} , \tag{7.41} $$

and the quantity (7.37) becomes

$$ j^\mu(x) = -i \left[ \phi^*(x)\,\partial^\mu\phi(x) - \left( \partial^\mu\phi^*(x) \right) \phi(x) \right] . \tag{7.42} $$

In quantum physics one learns that, indeed, this expression is a suitable candidate
for the description of the electric charge and current densities of a scalar particle.
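The equivalence of the real form (7.37) and the complex form (7.42) of the current is a matter of simple algebra, which the following sketch confirms for sample numbers. It assumes the normalization $\phi = (\varphi^1 + i\varphi^2)/\sqrt{2}$ for the complex field and treats the field values and their derivatives at one space-time point (and for one fixed index $\mu$) as plain numbers; all names and values are illustrative.

```python
import math

def current_real(phi1, phi2, dphi1, dphi2):
    """j^mu from (7.37) for one fixed index mu; dphi_i stands for d^mu phi^i."""
    return dphi2 * phi1 - dphi1 * phi2

def current_complex(phi1, phi2, dphi1, dphi2):
    """j^mu from (7.42), with the assumed normalization (phi1 + i phi2) / sqrt(2)."""
    field = complex(phi1, phi2) / math.sqrt(2.0)
    dfield = complex(dphi1, dphi2) / math.sqrt(2.0)
    return -1j * (field.conjugate() * dfield - dfield.conjugate() * field)

def rotate(phi1, phi2, alpha):
    """Rotation (7.35) in the inner space of the two fields."""
    return (phi1 * math.cos(alpha) - phi2 * math.sin(alpha),
            phi1 * math.sin(alpha) + phi2 * math.cos(alpha))

sample = (0.3, -1.2, 0.7, 2.5)  # phi1, phi2, d phi1, d phi2 (arbitrary numbers)
print(current_real(*sample), current_complex(*sample))

# The rotation leaves phi1**2 + phi2**2, and hence U and the mass term, unchanged.
r1, r2 = rotate(0.3, -1.2, 0.8)
print(0.3**2 + (-1.2)**2, r1**2 + r2**2)
```

The complex expression comes out real, with vanishing imaginary part, as a current built from real fields must.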




Exercises 



Chapter 1: Elementary Newtonian Mechanics 



1.1 Under the assumption that the orbital angular momentum $\mathbf{l} = \mathbf{r} \times \mathbf{p}$ of a
particle is conserved show that its motion takes place in a plane spanned by $\mathbf{r}_0$,
the initial position, and $\mathbf{p}_0$, the initial momentum. Which of the orbits of Fig. 1
are possible in this case? ($O$ denotes the origin of the coordinate system.)




1.2 In the plane of motion of Exercise 1.1 introduce polar coordinates $\{r(t), \varphi(t)\}$.
Calculate the line element $(ds)^2 = (dx)^2 + (dy)^2$, as well as $v^2 = \dot x^2 + \dot y^2$ and
$\mathbf{l}^2$, in the polar coordinates. Express the kinetic energy in terms of $\dot r$ and $\mathbf{l}^2$.

1.3 For the description of motions in $\mathbb{R}^3$ one may use Cartesian coordinates $\mathbf{r}(t) =
\{x(t), y(t), z(t)\}$, or spherical coordinates $\{r(t), \theta(t), \varphi(t)\}$. Calculate the infinitesimal
line element $(ds)^2 = (dx)^2 + (dy)^2 + (dz)^2$ in spherical coordinates. Use this
result to derive the square of the velocity $v^2 = \dot x^2 + \dot y^2 + \dot z^2$ in these coordinates.

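1.4 Let $\hat{\mathbf{e}}_x$, $\hat{\mathbf{e}}_y$, $\hat{\mathbf{e}}_z$ be Cartesian unit vectors. They then fulfill $\hat{\mathbf{e}}_x^2 = \hat{\mathbf{e}}_y^2 = \hat{\mathbf{e}}_z^2 = 1$,
$\hat{\mathbf{e}}_x \cdot \hat{\mathbf{e}}_y = \hat{\mathbf{e}}_x \cdot \hat{\mathbf{e}}_z = \hat{\mathbf{e}}_y \cdot \hat{\mathbf{e}}_z = 0$, $\hat{\mathbf{e}}_z = \hat{\mathbf{e}}_x \times \hat{\mathbf{e}}_y$ (plus cyclic permutations). Introduce
three, mutually orthogonal unit vectors $\hat{\mathbf{e}}_r$, $\hat{\mathbf{e}}_\varphi$, $\hat{\mathbf{e}}_\theta$ as indicated in Fig. 2. Determine
$\hat{\mathbf{e}}_r$ and $\hat{\mathbf{e}}_\varphi$ from the geometry of this figure. Confirm that $\hat{\mathbf{e}}_r \cdot \hat{\mathbf{e}}_\varphi = 0$. Assume
$\hat{\mathbf{e}}_\theta = \alpha\hat{\mathbf{e}}_x + \beta\hat{\mathbf{e}}_y + \gamma\hat{\mathbf{e}}_z$ and determine the coefficients $\alpha$, $\beta$, $\gamma$ such that $\hat{\mathbf{e}}_\theta^2 = 1$,
$\hat{\mathbf{e}}_\theta \cdot \hat{\mathbf{e}}_\varphi = 0 = \hat{\mathbf{e}}_\theta \cdot \hat{\mathbf{e}}_r$. Calculate $\mathbf{v} = \dot{\mathbf{r}} = d(r\hat{\mathbf{e}}_r)/dt$ in this basis as well as $\mathbf{v}^2$.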

1.5 A particle is assumed to move according to $\mathbf{r}(t) = \mathbf{v}^0 t$ with $\mathbf{v}^0 = \{0, v, 0\}$,
with respect to the inertial system $K$. Sketch the same motion as seen from another
reference frame $K'$ which is rotated about the $z$-axis of $K$ by an angle $\Phi$,

$$ x' = x\cos\Phi + y\sin\Phi , \qquad y' = -x\sin\Phi + y\cos\Phi , \qquad z' = z , $$

for the cases $\Phi = \omega$ and $\Phi = \omega t$, where $\omega$ is a constant.

1.6 A particle of mass $m$ is subject to a central force $\mathbf{F} = F(r)\,\mathbf{r}/r$. Show that
the angular momentum $\mathbf{l} = m\,\mathbf{r} \times \dot{\mathbf{r}}$ is conserved (i.e. its magnitude and direction)
and that the orbit lies in a plane perpendicular to $\mathbf{l}$.

1.7 (i) In an $N$-particle system that is subject to internal forces only, the potentials
$V_{ik}$ depend only on the vector differences $\mathbf{r}_{ik} = \mathbf{r}_i - \mathbf{r}_k$, but not on the individual
vectors $\mathbf{r}_i$. Which quantities are conserved in this system?

(ii) If $V_{ik}$ depends only on the modulus $|\mathbf{r}_{ik}|$ the force acts along the straight line
joining $i$ to $k$. There is one more integral of the motion.

1.8 Sketch the one-dimensional potential

$$ U(q) = -5q\,e^{-q} + q^{-4} + 2/q \quad\text{for } q > 0 $$

and the corresponding phase portraits for a particle of mass $m = 1$ as a function
of energy and initial position $q_0$. In particular, find and discuss the two points of
equilibrium. Why are the phase portraits symmetric with respect to the abscissa?

1.9 Study two identical pendula of length $l$ and mass $m$, coupled by a harmonic
spring, the spring being inactive when both pendula are at rest. For small
deviations from the vertical the energy reads

$$ E = \frac{1}{2m} \left( x_2^2 + x_4^2 \right) + \frac{1}{2} m\omega_0^2 \left( x_1^2 + x_3^2 \right) + \frac{1}{2} m\omega_1^2 (x_1 - x_3)^2 $$

with $x_2 = m\dot x_1$, $x_4 = m\dot x_3$. Identify the individual terms of this equation. Derive
from it the equations of motion in phase space.


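The transformation

$$ x \to u = Ax \quad\text{with}\quad A = \frac{1}{\sqrt{2}} \begin{pmatrix} \mathbb{1} & \mathbb{1} \\ \mathbb{1} & -\mathbb{1} \end{pmatrix} \quad\text{and}\quad \mathbb{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} $$

decouples these equations. Write the equations obtained in this way in dimensionless
form and solve them.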



1.10 The one-dimensional harmonic oscillator satisfies the differential equation

$$ m\ddot x(t) = -\lambda x(t) , \tag{1.1} $$

with $m$ the inertial mass, $\lambda$ a positive constant, and $x(t)$ the deviation from
equilibrium. Equivalently, (1.1) can be written as

$$ \ddot x + \omega^2 x = 0 , \qquad \omega^2 := \lambda/m . \tag{1.2} $$

Solve the differential equation (1.2) by means of $x(t) = a\cos(\mu t) + b\sin(\mu t)$ for
the initial condition

$$ x(0) = x_0 \quad\text{and}\quad p(0) = m\dot x(0) = p_0 . \tag{1.3} $$

Let $x(t)$ be the abscissa and $p(t)$ the ordinate of a Cartesian coordinate system.
Draw the graph of the solution with $\omega = 0.8$ that goes through the point
$(x_0 = 1, p_0 = 0)$.

1.11 Adding a weak friction force to the system of Exercise 1.10 yields the
equation of motion

$$ \ddot x + \kappa\dot x + \omega^2 x = 0 . $$

"Weak" means that $\kappa < 2\omega$. Solve the differential equation by means of

$$ x(t) = e^{\alpha t} \left[ x_0 \cos\tilde\omega t + (p_0/m\tilde\omega) \sin\tilde\omega t \right] . $$

Draw the graph $(x(t), p(t))$ of the solution with $\omega = 0.8$ which goes through
$(x_0 = 1, p_0 = 0)$.

1.12 A mass point of mass $m$ moves in the piecewise constant potential (see
Fig. 3)

$$ U = \begin{cases} U_1 & \text{for } x < 0 \\ U_2 & \text{for } x > 0 . \end{cases} $$

In crossing from the domain $x < 0$, where its velocity was $\mathbf{v}_1$, to the domain $x > 0$,
it changes its velocity (modulus and direction). Express $U_2$ in terms of the quantities
$U_1$, $v_1$, $\alpha_1$, and $\alpha_2$. What is the relation of $\alpha_1$ to $\alpha_2$ when (i) $U_1 < U_2$ and (ii)
$U_1 > U_2$? Work out the relationship to the law of refraction of geometrical optics.

Hint: Make use of the principle of energy conservation and show that one
component of the momentum remains unchanged in crossing from $x < 0$ to $x > 0$.

Fig. 4.



1.13 In a system of three mass points $m_1$, $m_2$, $m_3$ let $S_{12}$ be the center-of-mass of
1 and 2 and $S$ the center-of-mass of the whole system. Express the coordinates $\mathbf{r}_1$,
$\mathbf{r}_2$, $\mathbf{r}_3$ in terms of $\mathbf{r}_S$, $\mathbf{s}_a$, and $\mathbf{s}_b$, as defined in Fig. 4. Calculate the total kinetic
energy in terms of the new coordinates and interpret the result. Write the total angular
momentum in terms of the new coordinates and show that $\sum_i \mathbf{l}_i = \mathbf{l}_S + \mathbf{l}_a + \mathbf{l}_b$,
where $\mathbf{l}_S$ is the angular momentum of the center-of-mass and $\mathbf{l}_a$ and $\mathbf{l}_b$ are relative
angular momenta. By considering a Galilei transformation $\mathbf{r}' = \mathbf{r} + \mathbf{w}t + \mathbf{a}$,
$t' = t + s$ show that $\mathbf{l}_S$ depends on the choice of the origin, while $\mathbf{l}_a$ and $\mathbf{l}_b$ do not.

1.14 Geometric similarity. Let the potential $U(\mathbf{r})$ be a homogeneous function of
degree $\alpha$ in the coordinates $(x, y, z)$, i.e. $U(\lambda\mathbf{r}) = \lambda^\alpha U(\mathbf{r})$.

(i) Show by making the replacements $\mathbf{r} \to \lambda\mathbf{r}$ and $t \to \mu t$, and choosing
$\mu = \lambda^{1-\alpha/2}$, that the energy is modified by a factor $\lambda^\alpha$ and that the equation
of motion remains unchanged.

The consequence is that the equation of motion admits solutions that are
geometrically similar, i.e. the time differences $(\Delta t)_a$ and $(\Delta t)_b$ of points that
correspond to each other on geometrically similar orbits (a) and (b) and the
corresponding linear dimensions $L_a$ and $L_b$ are related by

$$ \frac{(\Delta t)_b}{(\Delta t)_a} = \left( \frac{L_b}{L_a} \right)^{1-\alpha/2} . $$

(ii) What are the consequences of this relationship for 

- the period of harmonic oscillation? 

- the relation between time and height of free fall in the neighborhood of the 
earth’s surface? 

- the relation between the periods and the semimajor axes of planetary ellipses? 

(iii) What is the relation of the energies of two geometrically similar orbits for 

- the harmonic oscillation? 

- the Kepler problem? 



1.15 The Kepler problem. (i) Show that the differential equation for $\Phi(r)$, in the
case of finite orbits, has the following form:

$$ \frac{d\Phi}{dr} = \frac{1}{r} \sqrt{ \frac{r_P\, r_A}{(r - r_P)(r_A - r)} } , \tag{1.4} $$

where $r_P$ and $r_A$ denote the perihelion and the aphelion, respectively. Calculate $r_P$
and $r_A$ and integrate (1.4) with the boundary condition $\Phi(r_P) = 0$.

(ii) Change the potential to $U(r) = (-A/r) + (B/r^2)$ with $|B| \ll l^2/2\mu$.
Determine the new perihelion $r'_P$ and the new aphelion $r'_A$ and write the differential
equation for $\Phi(r)$ in a form analogous to (1.4). Integrate this equation as in (i)
and determine two successive perihelion positions for $B > 0$ and for $B < 0$.

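Hint:

$$ \frac{d}{dx} \arccos\left( \frac{\alpha}{x} + \beta \right) = \frac{\alpha}{x}\,\frac{1}{\sqrt{x^2(1 - \beta^2) - 2\alpha\beta x - \alpha^2}} . $$

1.16 The most general solution of the Kepler problem reads, in terms of polar
coordinates $r$ and $\Phi$,

$$ r(\Phi) = \frac{p}{1 + \varepsilon\cos(\Phi - \Phi_0)} . $$

The parameters are given by

$$ p = \frac{l^2}{A\mu} , \quad (A = G m_1 m_2) , \qquad \varepsilon = \sqrt{1 + \frac{2El^2}{\mu A^2}} , \quad \left( \mu = \frac{m_1 m_2}{m_1 + m_2} \right) . $$
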
What values of the energy are possible if the angular momentum is given? Calculate
the semimajor axis of the earth’s orbit under the assumption $m_{\rm Sun} \gg m_{\rm Earth}$;

$G = 6.672 \times 10^{-11}\ {\rm m^3\,kg^{-1}\,s^{-2}}$,
$m_S = 1.989 \times 10^{30}$ kg,
$m_E = 5.97 \times 10^{24}$ kg.

Calculate the semimajor axis of the ellipse along which the sun moves about the
center-of-mass of the sun and the earth and compare the result to the solar radius
($6.96 \times 10^8$ m).
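
One possible way to get numbers for the last two questions is Kepler's third law for the relative motion, $a^3 = G(m_S + m_E)T^2/4\pi^2$. The sketch below is a hedged illustration, not the book's solution. The orbital period of one year (`T_YEAR`) is an assumption that the exercise does not state, and all names are made up for the example.

```python
import math

# Constants quoted in the exercise
G = 6.672e-11       # m^3 kg^-1 s^-2
M_SUN = 1.989e30    # kg
M_EARTH = 5.97e24   # kg
R_SUN = 6.96e8      # m, solar radius quoted in the exercise

# Assumption (not given in the exercise): orbital period of one year
T_YEAR = 365.256 * 86400.0  # s


def semimajor_axis_relative(period, total_mass):
    """Kepler's third law for the relative motion: a^3 = G M T^2 / (4 pi^2)."""
    return (G * total_mass * period**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)


a_rel = semimajor_axis_relative(T_YEAR, M_SUN + M_EARTH)

# Each body moves on a similar ellipse about the center of mass,
# scaled by the mass fraction of the *other* body.
a_sun = a_rel * M_EARTH / (M_SUN + M_EARTH)

print(f"semimajor axis of the relative orbit: {a_rel:.3e} m")
print(f"semimajor axis of the sun's ellipse:  {a_sun:.3e} m")
print(f"ratio to the solar radius:            {a_sun / R_SUN:.2e}")
```

With these inputs the relative orbit comes out at roughly $1.5 \times 10^{11}$ m, while the sun's own ellipse is far smaller than the solar radius, so the center of mass lies deep inside the sun.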

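1.17 Determine the interaction of two electric dipoles $\mathbf{p}_1$ and $\mathbf{p}_2$ (example for
noncentral potential force).

Hints: Calculate the potential of a single dipole $\mathbf{p}_1$, making use of the following
approximation. The dipole consists of two charges $\pm e_1$ at a distance $\mathbf{d}_1$. Let $e_1$
tend to infinity and $|\mathbf{d}_1|$ to zero, in such a way that their product $\mathbf{p}_1 = \mathbf{d}_1 e_1$ stays
constant. Then calculate the potential energy of a finite dipole $\mathbf{p}_2$ in the field of
the first and perform the same limit $e_2 \to \infty$, $|\mathbf{d}_2| \to 0$, with $\mathbf{p}_2 = \mathbf{d}_2 e_2$ constant,
as above. Calculate the forces that act on the two dipoles.

Answer:

$$W(1,2) = \frac{\mathbf{p}_1\cdot\mathbf{p}_2}{r^3} - \frac{3(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})}{r^5} ,$$

$$\mathbf{F}_{21} = -\nabla_1 W = \left[\frac{3(\mathbf{p}_1\cdot\mathbf{p}_2)}{r^5} - \frac{15(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})}{r^7}\right]\mathbf{r} + \frac{3\left[\mathbf{p}_1(\mathbf{p}_2\cdot\mathbf{r}) + \mathbf{p}_2(\mathbf{p}_1\cdot\mathbf{r})\right]}{r^5} = -\mathbf{F}_{12} .$$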
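
The quoted force can be checked numerically against the quoted potential energy. The sketch below compares the closed-form expression with a central finite-difference gradient of $W$. It assumes that $\mathbf{r}$ is the position of dipole 1 relative to dipole 2 and that Gaussian-type units are used, as the absence of a prefactor suggests. The sample vectors and helper names are arbitrary choices for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def W(p1, p2, r):
    """Dipole-dipole potential energy as quoted in the answer."""
    rr = dot(r, r) ** 0.5
    return dot(p1, p2) / rr**3 - 3.0 * dot(p1, r) * dot(p2, r) / rr**5


def force_analytic(p1, p2, r):
    """Closed-form force on dipole 1 as quoted in the answer."""
    rr = dot(r, r) ** 0.5
    radial = 3.0 * dot(p1, p2) / rr**5 - 15.0 * dot(p1, r) * dot(p2, r) / rr**7
    c1 = 3.0 * dot(p2, r) / rr**5  # coefficient multiplying p1
    c2 = 3.0 * dot(p1, r) / rr**5  # coefficient multiplying p2
    return [radial * r[i] + c1 * p1[i] + c2 * p2[i] for i in range(3)]


def force_numeric(p1, p2, r, h=1e-6):
    """Minus the gradient of W with respect to r, by central differences."""
    f = []
    for i in range(3):
        r_plus, r_minus = list(r), list(r)
        r_plus[i] += h
        r_minus[i] -= h
        f.append(-(W(p1, p2, r_plus) - W(p1, p2, r_minus)) / (2.0 * h))
    return f


# Arbitrary sample configuration (illustrative values only)
p1 = [1.0, 0.5, -0.3]
p2 = [-0.2, 0.8, 1.1]
r = [1.2, -0.7, 0.9]

fa = force_analytic(p1, p2, r)
fn = force_numeric(p1, p2, r)
max_diff = max(abs(x - y) for x, y in zip(fa, fn))
print("analytic:", fa)
print("numeric: ", fn)
print("largest component difference:", max_diff)
```

In general the force is not parallel to $\mathbf{r}$, which is what makes this an example of a noncentral potential force.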

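1.18 Let the motion of a point mass be governed by the law

$$\dot{\mathbf{v}} = \mathbf{v} \times \mathbf{a} , \qquad \mathbf{a} = \text{const} . \qquad (1.5)$$

Show that $\dot{\mathbf{r}}\cdot\mathbf{a} = \mathbf{v}(0)\cdot\mathbf{a}$ holds for all $t$ and reduce (1.5) to an inhomogeneous
differential equation of the form $\ddot{\mathbf{r}} + \omega^2\mathbf{r} = \mathbf{f}(t)$. Solve this equation by means
of the substitution $\mathbf{r}_{\rm inhom}(t) = \mathbf{c}t + \mathbf{d}$. Express the integration constants in terms
of the initial values $\mathbf{r}(0)$ and $\mathbf{v}(0)$. Describe the curve $\mathbf{r}(t) = \mathbf{r}_{\rm hom}(t) + \mathbf{r}_{\rm inhom}(t)$.

Hint:

$$\mathbf{a}_1 \times (\mathbf{a}_2 \times \mathbf{a}_3) = \mathbf{a}_2(\mathbf{a}_1\cdot\mathbf{a}_3) - \mathbf{a}_3(\mathbf{a}_1\cdot\mathbf{a}_2) .$$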
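
Before solving (1.5) analytically it can help to see the conservation law numerically. The sketch below integrates the velocity equation with a classical fourth-order Runge-Kutta step and monitors the velocity component along $\mathbf{a}$ and the speed. The integrator, step size and initial values are illustrative assumptions, not part of the exercise.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def rk4_step(v, a, dt):
    """One classical Runge-Kutta step for dv/dt = v x a."""
    def f(w):
        return cross(w, a)

    k1 = f(v)
    k2 = f(tuple(v[i] + 0.5 * dt * k1[i] for i in range(3)))
    k3 = f(tuple(v[i] + 0.5 * dt * k2[i] for i in range(3)))
    k4 = f(tuple(v[i] + dt * k3[i] for i in range(3)))
    return tuple(v[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
                 for i in range(3))


# Illustrative values (assumptions, not from the exercise)
a = (0.3, -0.4, 1.2)
v0 = (1.0, 0.5, -0.2)
dt, steps = 1e-3, 5000

v = v0
max_drift = 0.0  # largest deviation of v.a from its initial value
for _ in range(steps):
    v = rk4_step(v, a, dt)
    max_drift = max(max_drift, abs(dot(v, a) - dot(v0, a)))

speed_change = abs(dot(v, v) ** 0.5 - dot(v0, v0) ** 0.5)
print("largest drift of v.a:", max_drift)
print("change of |v|:       ", speed_change)
```

Both quantities stay constant up to rounding, which is consistent with a uniform drift along $\mathbf{a}$ superposed on a rotation of the perpendicular velocity component.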

1.19 An iron ball falls vertically onto a horizontal plane from which it is reflected. 
At every bounce it loses the nth fraction of its kinetic energy. Discuss the orbit 
$x = x(t)$ of the bouncing ball and derive the relation between $x_{\max}$ and $t_{\max}$.

Hint : Study the orbit between two successive bounces and sum over previous times. 

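1.20 Consider the following transformations of the coordinate system:

$$\{t, \mathbf{r}\} \underset{E}{\longrightarrow} \{t, \mathbf{r}\} , \qquad \{t, \mathbf{r}\} \underset{P}{\longrightarrow} \{t, -\mathbf{r}\} , \qquad \{t, \mathbf{r}\} \underset{T}{\longrightarrow} \{-t, \mathbf{r}\} ,$$

as well as the transformation PT that is generated by performing first T and then P.
Write these transformations in the form of matrices that act on the four-component
vector $\begin{pmatrix} t \\ \mathbf{r} \end{pmatrix}$. Show that $\{E, P, T, PT\}$ form a group.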
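
The closure property can be verified mechanically once the matrices are written down. The sketch below assumes the diagonal $4 \times 4$ representation acting on $(t, x, y, z)$ and builds the multiplication table. The helper names are illustrative, and associativity is inherited from matrix multiplication, so it is not tested separately.

```python
def diag(*entries):
    """Diagonal matrix as a list of rows."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Assumed representation on the column vector (t, x, y, z)
group = {
    "E": diag(1, 1, 1, 1),     # identity
    "P": diag(1, -1, -1, -1),  # space reflection
    "T": diag(-1, 1, 1, 1),    # time reversal
}
group["PT"] = matmul(group["P"], group["T"])  # first T, then P


def name_of(M):
    """Name of the group element equal to M, or None if M is not in the set."""
    for name, G in group.items():
        if G == M:
            return name
    return None


table = {(x, y): name_of(matmul(group[x], group[y]))
         for x in group for y in group}

for x in group:
    print(x, [table[(x, y)] for y in group])
```

Every product lands back in the set, and every element is its own inverse. The table is symmetric, so the group is abelian.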

1.21 Let the potential $U(r)$ of a two-body system be $C^2$ (twice continuously
differentiable). For fixed relative angular momentum, under which additional
condition on $U(r)$ are there circular orbits? Let $E_0$ be the energy of such an orbit.
Discuss the motion for $E = E_0 + \varepsilon$ for small positive $\varepsilon$. Study the special cases

$$U(r) = r^n \quad \text{and} \quad U(r) = \lambda/r .$$

1.22 Following the methods explained in Sect. 1.26 show the following. 

(i) In the northern hemisphere a falling object experiences a southward deviation 
of second order (in addition to the first-order eastward deviation). 

(ii) A stone thrown vertically upward falls down west of its point of departure, the 
deviation being four times the eastward deviation of the falling stone. 

1.23 Let a two-body system be subject to the potential

$$U(r) = -\frac{\alpha}{r^2}$$

in the relative coordinate $r$, with positive $\alpha$. Calculate the scattering orbits $r(\Phi)$.
For fixed angular momentum what are the values of $\alpha$ for which the particle makes
one (two) revolutions about the center of force? Follow and discuss an orbit that
collapses to $r = 0$.

1.24 A pointlike comet of mass m moves in the gravitational field of a sun with 
mass M and radius R. What is the total cross section for the comet to crash on 
the sun? 

1.25 Solve the equations of motion for the example of Sect. 1.21.2 (Lorentz force
with constant fields) for the case

$$\mathbf{B} = B\hat{\mathbf{e}}_z , \qquad \mathbf{E} = E\hat{\mathbf{e}}_z .$$



Chapter 2: The Principles of Canonical Mechanics 

2.1 The energy $E(q, p)$ is an integral of a finite, one-dimensional, periodic motion.
Why is the portrait symmetric with respect to the $q$-axis? The surface enclosed
by the periodic orbit is

$$F(E) = \oint p\,dq = 2\int_{q_{\min}}^{q_{\max}} p\,dq .$$

Show that the change of $F(E)$ with $E$ equals the period $T$ of the orbit,
$T = dF(E)/dE$. Calculate $F$ and $T$ for the example

$$E(q, p) = p^2/2m + m\omega^2 q^2/2 .$$
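
The relation $T = dF/dE$ can be tested numerically for the oscillator example. The sketch below evaluates the enclosed area by a midpoint rule after the substitution $q = q_{\max}\sin\theta$, differentiates it by central differences, and compares with $2\pi E/\omega$ and $2\pi/\omega$. The parameter values, the quadrature and the function names are assumptions chosen for illustration.

```python
import math


def phase_area(E, m, omega, n=2000):
    """F(E) = 2 * integral of p dq between the turning points (midpoint rule)."""
    q_max = math.sqrt(2.0 * E / (m * omega**2))  # turning point
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * h
        q = q_max * math.sin(theta)
        # max() guards against tiny negative values from rounding
        p = math.sqrt(max(0.0, 2.0 * m * (E - 0.5 * m * omega**2 * q**2)))
        total += p * q_max * math.cos(theta) * h  # dq = q_max cos(theta) dtheta
    return 2.0 * total


def period_from_area(E, m, omega, dE=1e-4):
    """T = dF/dE, approximated by a central difference."""
    return (phase_area(E + dE, m, omega) - phase_area(E - dE, m, omega)) / (2.0 * dE)


# Illustrative parameter values (assumptions)
m, omega, E = 1.3, 2.0, 0.7

F = phase_area(E, m, omega)
T = period_from_area(E, m, omega)
print("F numeric:", F, " expected 2*pi*E/omega:", 2.0 * math.pi * E / omega)
print("T numeric:", T, " expected 2*pi/omega:  ", 2.0 * math.pi / omega)
```

The area grows linearly with $E$ for the oscillator, so its derivative, and hence the period, does not depend on the energy.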



2.2 A weight glides without friction along a plane inclined by the angle $\alpha$ with
respect to the horizontal. Study this system by means of d’Alembert’s principle. 

2.3 A ball rolls without friction on the inside of a circular annulus. The annulus 
is put upright in the earth’s gravitational field. Use d’Alembert’s principle to derive 
the equation of motion and discuss its solutions. 

2.4 A mass point m that can only move along a straight line is tied to the point 
A by means of a spring. The distance of A to the straight line is l (cf. Fig. 5). 
Calculate (approximately) the frequency of oscillation of the mass point. 

2.5 Two equal masses $m$ are connected by means of a (massless) spring with
spring constant $\kappa$. They move without friction along a rail, their distance being
$l$ when the spring is inactive. Calculate the deviations $x_1(t)$ and $x_2(t)$ from the
equilibrium positions, for the following initial conditions:

$$x_1(0) = 0 , \qquad \dot{x}_1(0) = v_0 ,$$
$$x_2(0) = l , \qquad \dot{x}_2(0) = 0 .$$



2.6 Given a function $F(x_1, \ldots, x_f)$ that is homogeneous and of degree $N$ in its
$f$ variables, show that

$$\sum_{i=1}^{f} \frac{\partial F}{\partial x_i}\, x_i = NF .$$
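
A quick numerical sanity check of this identity is possible for any concrete homogeneous function. The sketch below uses a hypothetical cubic polynomial in three variables, so $N = 3$ and $f = 3$, and approximates the partial derivatives by central differences. The polynomial and the sample point are arbitrary choices.

```python
def F(x, y, z):
    """Hypothetical example: every term has total degree 3."""
    return x**2 * y - 3.0 * x * y * z + 2.0 * z**3


def euler_sum(func, point, h=1e-6):
    """Sum over i of x_i * dF/dx_i, with central-difference derivatives."""
    total = 0.0
    for i, xi in enumerate(point):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        total += xi * (func(*plus) - func(*minus)) / (2.0 * h)
    return total


N = 3
point = (0.7, -1.2, 1.5)

lhs = euler_sum(F, point)
rhs = N * F(*point)
print("sum of x_i dF/dx_i:", lhs)
print("N * F:             ", rhs)
```

The two printed numbers agree up to the finite-difference error.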



[Fig. 5; labels: A, m]

2.7 If in the integral

$$I[y] = \int_{x_1}^{x_2} dx\, f(y, y')$$

$f$ does not depend explicitly on $x$, show that

$$y'\frac{\partial f}{\partial y'} - f(y, y') = \text{const} .$$

Apply this result to $L(q, \dot{q}) = T - U$ and identify the constant. $T$ is assumed to
be a homogeneous quadratic form in $\dot{q}$.

2.8 Solve the following two problems (whose solutions are well known) by means 
of variational calculus: 

(i) the shortest connection between two points $(x_1, y_1)$ and $(x_2, y_2)$ in the Euclidean
plane;

(ii) the shape of a homogeneous, fine-grained chain suspended at its end points
$(x_1, y_1)$ and $(x_2, y_2)$ in the gravitational field.

Hints: Make use of the result of Exercise 2.7. The equilibrium shape of the chain is
determined by the lowest position of its center of mass. The line element is given by

$$ds = \sqrt{(dx)^2 + (dy)^2} = \sqrt{1 + y'^2}\, dx .$$

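2.9 Two coupled pendula can be described by means of the Lagrangian function

$$L = \tfrac{1}{2} m (\dot{x}_1^2 + \dot{x}_2^2) - \tfrac{1}{2} m\omega_0^2 (x_1^2 + x_2^2) - \tfrac{1}{4} m (\omega_1^2 - \omega_0^2)(x_1 - x_2)^2 .$$

(i) Show that the Lagrangian function

$$L' = \tfrac{1}{2} m (\dot{x}_1 - i\omega_0 x_1)^2 + \tfrac{1}{2} m (\dot{x}_2 - i\omega_0 x_2)^2 - \tfrac{1}{4} m (\omega_1^2 - \omega_0^2)(x_1 - x_2)^2$$

leads to the same equations of motion. Why is this so?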

(ii) Show that transforming to the eigenmodes of the system leaves the Lagrange 
equations form invariant. 

2.10 The force acting on a body in three-dimensional space is assumed to be
axially symmetric with respect to the $z$-axis. Show that

(i) its potential has the form $U = U(r, z)$, where $\{r, \phi, z\}$ are cylindrical
coordinates.




